ABSTRACT


Microelectromechanical systems (MEMS) are a broad and rapidly expanding field that is currently 
receiving a great deal of attention because of the potential to significantly improve the ability to sense, 
analyze, and control a variety of processes. These processes are as varied as heating and ventilation 
systems, automobiles, medicine, aeronautical flight, military surveillance, weather forecasting, and space 
exploration. MEMS are a class of systems that are physically very small (micron level) and are a blend 
of electrical and mechanical components — similar to ICs, but including both electrical and mechanical 
systems on one chip. 

This research establishes reliability estimation and prediction for MEMS devices at the conceptual design 
phase using neural networks. At the conceptual design phase of a project, before the MEMS devices are 
actually built and tested, traditional methods of quantifying reliability are inadequate because the device 
is not in existence and cannot be tested to establish the reliability distributions. A novel approach using 
neural networks is created to predict the overall reliability of a MEMS device based on its components 
and each component’s attributes. 

The methodology begins with collecting attribute data (fabrication process, physical specifications, 
operating environment, property characteristics, packaging, etc.) and reliability data for many types of 
microengines developed by Sandia National Laboratories in Albuquerque, New Mexico (the only source 
for MEMS reliability data in sufficient quantity). These data are partitioned into training data (the 
majority) and validation data (the remainder). A neural network is applied to the training data (both 
attribute and reliability); the attributes become the system inputs and reliability data (cycles to failure), 
the system output. After the neural network is trained with sufficient data, the validation data are used to 
verify that the neural networks provided accurate reliability estimates. Now, the reliability of a new 
proposed MEMS device can be estimated by using the appropriate trained neural networks developed in 
this work. 

SECTION 1: INTRODUCTION


Research institutions and commercial laboratories are fabricating revolutionary new devices that may 
become one of the key defining technologies of the upcoming decade. These devices, known as 
microelectromechanical systems (MEMS), are a class of semiconductor devices that use both mechanical 
and electrical systems at a microscopic scale. MEMS are essentially a hybrid of electrical and 
mechanical systems visible only under a microscope. These devices are minute, even compared
to a microscopic dust mite (see Figure 1 and footnote 1). In the MEMS environment, gravity and inertia no longer
dominate; instead, atomic forces and surface effects control device behavior (Sandia, 1997). MEMS
devices are generally batch-fabricated, tens of thousands at a time, with economies of scale significantly 
reducing unit cost (Rai-Choudhury, 1997). In addition, the MEMS process can create highly reliable 
systems with precision (Tanaka, et al., 1995). 

1.1 BACKGROUND

Over the past four decades, the number of transistors incorporated on a single piece of silicon has grown
exponentially (each with increased performance and capability), while the cost per unit of these devices
has decreased exponentially (Rai-Choudhury, 1997). These exponential
jumps are attributable to vast improvements in the manufacturing process control, cleanliness, critical 
dimension precision, and automated test equipment (Stark, 1999). With the cost of these integrated 
circuit (IC) building blocks going down and reliability going up, the computation, processing, and 
communication power that can be achieved in a given device becomes overwhelming. 



Figure 1. Spider mite on mirror assembly. Courtesy of Sandia National Labs.


1 Figures provided by Sandia National Laboratories. 

The commercial production of the first IC signaled the beginning of the silicon revolution (Tanaka, et al.,
1995). Now, there are very few areas of daily life that are not somehow directly or indirectly affected by 
ICs (Stark, 1999). In the coming decade of this new millennium, the next step in the silicon revolution 
could be the widespread use of MEMS devices in many commercial and government applications
(Rai-Choudhury, 1997).

Growth and development of microelectronics has been limited mostly to data processing, storage, and 
data transfer (IC domain). The next silicon revolution will take this realm beyond pure electronics and 
into this hybrid domain of mechanical systems (Rai-Choudhury, 1997). With this transition, chips of 
tomorrow will transcend the plain electronics domain. Figure 2 shows a MEMS gear designed to 
perform mechanical work. 

The concept of creating micromachines was first described in 1959 by R. Feynman in his famous papers 
that are considered the founding documents for MEMS (Rai-Choudhury, 1997). In addition, less than 10 
years after the invention of the IC, H.C. Nathanson used a microelectronic fabrication technique to make 
the world’s first micromechanical device (Rai-Choudhury, 1997). The ability to use
silicon for microscopic machines was further described in a seminal 1982 paper by K. Petersen.

MEMS technology has become one of the most promising emerging technologies because of its potential 
to significantly alter many applications. MEMS technology is receiving substantial support for research 
and development throughout the world and goes by several names, such as mechatronics, microsystems, 
and micromachines. MEMS will likely enable vast improvements in sensing and control in automotive, 
medical, space, military, telecommunication, computing, environmental, industrial, and recreational 
applications (Mehregany, 1993). 

MEMS will miniaturize traditional systems by several orders of magnitude. For example, with this 
technology, a global positioning system could be placed on the tip of a pencil or the fastest computers 
could be placed inside a wallet as a credit card. Also within the realm of possibility is the integration of 
man and machine with embedded bionics (Guckel, 1993). Given the success of the electronic 
microcircuit, it is predictable that these same technologies will bring mechanical machines to the 
microscopic world and produce similar results: low cost, high performance, and high reliability. With 
MEMS poised to do for mechanics what the transistor did for electronics, interest in MEMS research has 
dramatically increased (Rai-Choudhury, 1997). 

MEMS technology may allow free-ranging, autonomous robots to enter the microdomain and perform 
useful work like cleaning our blood veins, repairing broken nerves, repairing tiny defects in ICs, 
scrubbing internal components of a chemical or nuclear plant, or performing any multitude of other 
microdomain tasks (Rai-Choudhury, 1997). 

With the integration of sensing, actuation, and signal processing into a single miniature solid-state
device, MEMS devices can operate at low power and be manufactured at low cost. These capabilities 
will allow entirely new solutions to be devised, such as miniature weather stations and microanalytical 
instrumentation (Malafsky, 1996). 

Figure 2. Precision MEMS gears. Courtesy of Sandia National Labs.


MEMS can be made cheaply because they build on the knowledge, experiences, and infrastructure of the 
existing IC manufacturing field (Rai-Choudhury, 1997). The general manufacturing process of ICs, by 
successive deposition, photo patterning, and then etching of thin films on silicon, is directly translated to 
the MEMS manufacturing world (Sze, 1994). In the area of MEMS, these same IC fabrication sequences 
are used to etch mechanical and electrical structures. 

Additionally, batch fabrication has also reduced the unit cost of IC chips. When ICs are batch-fabricated 
with no individual assembly or manipulation required, the cost of building just one or a million 
transistors on a single wafer is essentially the same (Sze, 1994). Due to improvements in processing 
technologies, research and development of micromechanical devices has exploded since the early 1990s 
(Rai-Choudhury, 1997). In the ensuing years, electromechanical systems were routinely fabricated at the 
micron scale. The result was a whole new class of sensors and actuators that perform common tasks on 
smaller scales and are readily suited for mass production (Mehregany, 1993). 

Paul Saffo, Director of the Institute for the Future, in Menlo Park, California, suggests that this 
inexpensive technology will increase overall efficiency in many different segments of our economy. For 
example, a wireless network could be cheaply and efficiently embedded in every manufacturing device at 
a plant with sensors that report back to a central unit on how well production is progressing. Saffo 
indicates that these inexpensive, but highly reliable systems could pave the way toward incredible 
manufacturing efficiencies, mass customization of goods, and “consumer connectivity like you never
imagined” (Weinberg, 1999).

The MEMS field has grown rapidly in the last decade and is now estimated to have a market of $6-$14
billion. This growth is partially due to its use of the large IC manufacturing base, which allowed new 
device designs to be quickly and inexpensively built and tested (Wise, 1991). 

MEMS can be used to perform the tasks of macroscopic devices at a reduced cost and with little to no 
loss in performance. Actually in some instances, MEMS-based devices have outperformed their 
traditional counterparts (Malafsky, 1998). By using simple mechanical structures and tailoring ICs to 
suit specific tasks, designers have seen drastic reductions in device scales (size/weight). Their size alone 
makes MEMS attractive within the automotive and aerospace industries (Malafsky, 1998). But more 
promising than reductions in size, reductions in costs can provide commercial feasibility in a variety of 
applications. By increasing throughput against a largely fixed cost structure, manufacturers can
reduce unit prices roughly in proportion to the increase in production (Rai-Choudhury, 1997).

1.2 CURRENT TECHNOLOGIES 

Understanding the stated advantages of MEMS, designers have started developing a range of products to 
suit commercial needs. The first major MEMS to gain commercial feasibility were accelerometers, 
which were pioneered to provide zero-fault airbag deployment systems (Trimmer, 1997). Widespread 
introduction did not take place until Chrysler introduced them in their American-made vehicles in 1989 
as a result of government and consumer group pressure. Integrating a diagnostic circuit into a sensor, 
engineers were able to produce a device that could not only sense acceleration but that could also detect 
internal failures. Replacing a faulty system based on ball bearings and plastic tubing that was prone to 
misfire, these new devices became the automotive industry’s standard (Payne and Dinsmore, 1991).

Building from the technological, as well as commercial, success of these initial designs, engineers have 
developed a wide variety of MEMS motion sensors. Recently, research has been conducted into 
producing micro-gyroscopes as part of a fully integrated inertial reference unit. Development has also 
commenced on micro-seismometers and micro-hygrometers that could provide miniaturized weather 
stations when incorporated with accelerometers (Colclaser, 1980). 

Current MEMS work is also progressing in the microprocessor environment. Because the power
dissipation requirements of the average current-market microprocessor are increasing exponentially with
time, research has begun to find better ways to conduct heat away from ICs (Rai-Choudhury, 1997).
Using MEMS, it may be possible to take point-contact voltage and current measurements on the
microprocessor in real time, so that active cooling can be appropriately applied (Martinez de Aragon,
1998).

A promising field within MEMS is optical devices where, for instance, digitally controlled MEMS 
television sets can be created. Using micro-mirrors placed on top of memory arrays, researchers have 
developed a television projection unit on a semiconductor wafer that has all the functionality of a 
traditional television tube (Helvajian, 1995). 

Mechanical MEMS sensors can be used to monitor shock and vibration in all phases of a system’s life. 
For example, as any system is being built, components and subsystems are transported between 
manufacturers, integrators, and installers. Shock and vibration during any of these
trips can cause significant damage, which embedded MEMS devices could sense and record
(Robinson, 1995).

In the biomedical arena, MEMS devices can be used both to monitor a patient’s physiology and to
augment human capabilities. In fact, infusion of MEMS technology in medical applications was one of 
the earliest commercial successes. Millions of disposable blood pressure sensors are used annually 
(IEEE, 1995). However, medical applications pose additional challenges to MEMS technology because
of the need for compatibility with human biology, in some cases, long-term compatibility. These 
compatibility factors include material properties, electric hazard, energy supply, and heat dissipation. 
MEMS devices are envisioned for complex applications in sensory substitution, drug delivery, organ 
substitution, and neural interfaces (Dario, 1995). 

Specifically, in the optics area, the University of Rochester, the National Science Foundation, the 
National Eye Institute, and Bausch & Lomb are conducting joint research to develop an adaptive optics 
device that can correct visual distortions in the eye. With this technology, subtle imperfections that were 
even unmeasurable just a few years ago can be corrected. Correcting these imperfections, even in a 
person who has 20/20 vision, can result in greatly improved vision. It may be possible to correct 
anyone’s vision to 20/10. Looking through an adaptive optics device, everything becomes sharper and 
clearer. Specifically, imperfections are corrected with MEMS mirrors that can bend and customize the 
shape. The subtle shaping, done in response to the customized measurements of the individual’s eye, 
alters the light in such a way that it exactly counters the specific distortions of the person’s eye 
(Williams, 2000). 

MEMS devices are also being developed for many commercial and government transportation uses.
These functions can be grouped into four main areas: guidance and control, propulsion and power,
communications, and sensing (Kukkonen, 1997). Sensing capabilities that can use MEMS technology 
include pressure, hygrometer, wind velocity, mass spectrometer, optical spectrometer, and chemical 
analyzers. For guidance and control, MEMS accelerometers, gyroscopes, magnetometers, and microflaps 
will be required for system development. Micro-thrusters and micro-thermoelectric and photoelectric 
generators will be needed for development of MEMS-based propulsion and power systems (Malafsky, 
1998). In addition, MEMS sensors can be used to monitor, for example, a given system’s performance or a
patient’s physiology, or to perform planetary and meteorological sensing (Kukkonen, 1997).

In the fast-growing area of transportation, inertial guidance units (IGUs) can be miniaturized with MEMS 
technology. An IGU is composed of both gyroscopes to measure angular motion and accelerometers to 
measure linear motion. The accuracy required of the gyroscopes and accelerometers depends strongly on 
the application (George, 1998). The most demanding applications, such as in submarines and 
intercontinental ballistic missiles, require extremely low drift rates because of the long mission time and 
the growth of error with time squared (Yazdi, 1998). 
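
The time-squared growth of error can be illustrated with a minimal sketch. It assumes the only error source is a constant accelerometer bias, so that integrating twice gives a position error of one half the bias times time squared. The function name and the bias value are illustrative assumptions, not figures from the cited sources.

```python
def position_error_from_bias(bias_m_s2: float, t_s: float) -> float:
    """Position error (m) after t_s seconds caused by a constant accelerometer bias.

    Integrating a constant acceleration error b twice gives 0.5 * b * t**2.
    """
    return 0.5 * bias_m_s2 * t_s ** 2


# Hypothetical bias of 1 mm/s^2, evaluated at three mission times.
bias = 1e-3
errors = {t: position_error_from_bias(bias, t) for t in (60.0, 600.0, 3600.0)}
for t, err in errors.items():
    print(f"t = {t:6.0f} s -> position error = {err:10.1f} m")
```

Under this assumption, a tenfold increase in mission time gives a hundredfold increase in error, which is why long missions demand very low drift rates.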

Another area in which MEMS research and development is rapidly progressing is space, where low-cost,
high-reliability, small-size, low-power MEMS can have dramatic benefits (Malafsky, 1998). NASA
hopes to eventually replace the large satellites that explore our solar system and beyond with miniaturized
spacecraft (Malafsky, 1998). With every pound sent to Mars costing upwards of one million dollars 
(considering development, launch, operational costs, etc.), the potential of sending a fully integrated 
spacecraft weighing just a hundred pounds, instead of several thousands, offers substantial benefits 
(Stark, 1999). This is vital considering the current federal budgetary constraints. In addition, by using 
MEMS technology, NASA will be able to embed many varying systems into one mission, thereby 
gaining more science with the same investment (Malafsky, 1998). 

Active control of aircraft and spacecraft is also possible with MEMS devices. A MEMS device using an
on-chip actuator as a microflap can control the turbulent flow over a wing. Also, an on-chip shear stress 
sensor can monitor the flow dynamics. With integrated electronics, these sensors could provide the 
analysis and feedback control to the microflap (de Groot, 1998). 

Spacecraft development could significantly benefit in many ways from the infusion of MEMS 
technology. With the multidisciplinary approach to MEMS development and incorporation, complete 
spacecraft that are entirely composed of MEMS systems could soon be created and deployed. 

1.3 PROBLEM DESCRIPTION 

An important part of any development process is being able to quantify the reliability of the device at the 
conceptual design phase. At the conceptual design phase of a project, before the MEMS devices are 
actually built and tested, traditional methods of quantifying reliability are inadequate because the device 
is not in existence and cannot be tested to establish reliability distributions. Design engineers require 
a methodology for estimating MEMS reliability. Within this research, a novel approach using neural
networks was created to predict the overall reliability of a MEMS device based on the device's attributes. 

Since MEMS research is still in its infancy, the need for defining issues and developing reliability tools 
is critical. The goal of this research was not just to provide reliability modeling techniques for system 
implementers, but also to provide an analysis tool for developers at the conceptual design phase of a 
MEMS project. Given the commercialization of MEMS, reliability issues (which have been previously 
overlooked) will become one of the main emphases of MEMS research. To ensure commercial 
feasibility, reliability issues must be raised in unison with the development of MEMS. 

In confronting the issues of MEMS reliability assurance, developers will certainly have different 
requirements. For example, a crewed Mars mission will have a different set of requirements and 
specifications than an electronics device designed for home use, but there will be similar methodologies 
for assessing and quantifying the reliability of both. This research is designed to use basic similarities in 
design requirements to provide a means of developing MEMS reliability modeling. To quantify the 
reliability of a MEMS component, we must consider not only the device itself, but the entire process 
surrounding the part, from conception, design, fabrication, testing, and packaging schemes, and 
ultimately to the environment in which the device will operate. This means that the development process 
must be qualified and effectively modeled, including the fabrication process, quality standards, and 
fabricator’s experience. In addition, the design must be verified, and the packaging certified. 

A goal of this research is to develop a technique to quantify overall risk and reliability of a proposed 
MEMS device before it is actually created. To guide MEMS process development through reliability 
evaluations, we must quantify MEMS reliability by evaluation and analysis of devices, test structures, 
and materials. This reliability estimate must be based on data available at the conceptual design phase of 
a project — data about the fabrication process, design characteristics, physical attributes, and performance 
expectations from the device, including parameters related to the operating environment. Neural 
networks may provide an ideal mechanism to translate these attributes into a predictive reliability 
estimate. 

1.4 A PROPOSED SOLUTION

The objective of this research was to provide reliability modeling techniques for MEMS devices at the 
conceptual design phase using neural networks. The general methodology for quantifying reliability of a 
MEMS device is as follows. First, attribute data (those that do or might correlate with overall
reliability, e.g., fabrication process details, physical specifications, operating environment, property
characteristics, or packaging) and reliability data are collected for MEMS devices. These data are
randomly partitioned into training data (the majority) and validation data (the remainder). A neural 
network is then applied to the training data (both attribute and reliability data are used to train the 
networks) — the attributes eventually become the system inputs and reliability, the system output. During 
the training process, the neural networks will find the correlation between the attributes and the 
reliability estimate. After the networks are trained, the validation data are used to verify that the neural 
networks provided accurate reliability estimates — independent validation that the neural network is 
accurately predicting reliability. Now, reliability of a new proposed MEMS device can be estimated by 
using the appropriate trained neural networks. 

In addition, these neural networks can be used in the design process to optimize the overall reliability,
since the networks can provide insight into which design, fabrication, and operating attributes are
significant determinants of overall reliability (a sensitivity analysis can easily be performed with the
results of the modeling).
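
A one-at-a-time sensitivity analysis of this kind might look like the following sketch. The `toy_model` function is a hypothetical stand-in for a trained network, and the attribute names and coefficients are invented for illustration; they are not results of this research.

```python
from typing import Callable, Dict


def one_at_a_time_sensitivity(
    predict: Callable[[Dict[str, float]], float],
    baseline: Dict[str, float],
    rel_step: float = 0.05,
) -> Dict[str, float]:
    """Relative change in the prediction when each attribute is increased by
    rel_step (5% by default) while all other attributes stay at baseline.

    Note: an attribute whose baseline value is 0 is not perturbed by a
    relative step and will therefore report zero sensitivity.
    """
    base_out = predict(baseline)
    result = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value * (1.0 + rel_step)
        result[name] = (predict(perturbed) - base_out) / base_out
    return result


def toy_model(x: Dict[str, float]) -> float:
    # Hypothetical stand-in for a trained network; NOT a real reliability law.
    return 1.0e6 - 5.0e3 * x["humidity_pct"] - 2.0 * x["operating_freq_hz"]


baseline = {"humidity_pct": 30.0, "operating_freq_hz": 1000.0}
sens = one_at_a_time_sensitivity(toy_model, baseline)
# Attributes ordered from most to least influential at this design point.
ranked = sorted(sens, key=lambda k: abs(sens[k]), reverse=True)
```

The ranking is local to the chosen baseline; a different design point may order the attributes differently.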

1.5 INTEGRATED MEMS EFFORT NEEDED 

Large MEMS efforts are under way in the Department of Defense, Department of Commerce, NASA, 
Department of Energy, and in the European Space Agency. In some cases, NASA has already started 
collaborative relationships with these other agencies (Malafsky, 1998).

There are many roles for corporate and government agencies to fill in the MEMS technology field, 
including basic research and development, technology prototyping, field-testing, and operational use. All 
of these efforts will help MEMS reach its potential and promise. 

Despite the many successful prototypes, MEMS devices must still make the difficult transition from 
research and development to a completed product. This transition introduces several new issues that 
must be addressed. Products must not only satisfy an operational need, but must be functionally reliable, 
withstand the rigors of deployment, maintain sensitivity and resolution in an operational setting, and be 
manufactured at a competitive cost (Malafsky, 1998). MEMS reliability estimation and modeling is a 
key portion of this effort. 

1.6 RESEARCH OVERVIEW 

The purpose of the research is to develop and investigate the feasibility of creating a predictive tool for 
MEMS reliability using neural networks. This research emphasizes reliability estimation and prediction 
for MEMS devices at the conceptual design phase where traditional methods to quantify reliability are 
infeasible. A new approach using neural networks was created to predict the overall reliability of a 
MEMS device based on its attributes including design criteria, physical specifications, fabrication 
method, packaging of the MEMS devices, and details of the operating environment. The developed 
neural network heuristic will minimize the error in estimating the reliability of a MEMS device by 
mapping these selected attributes to a reliability value. The neural network model will reveal any
correlation between the attributes and reliability. 

Section 2 will analyze the problem and discuss the methodology used to derive a solution. Specifically, 
the approach of using neural networks will be detailed with discussions into the different types of 
modeling networks that are used in this research. Section 3 presents the results of modeling MEMS 
reliability with neural networks. Also, the feasibility of this approach will be discussed. Finally, Section 
4 will summarize the research and draw conclusions from the modeled data. That section also outlines areas
that could be researched further. The Appendix contains all the raw data that were used
to train and test the neural networks. 

SECTION 2: METHODOLOGY


The research will establish a reliability estimation and prediction scheme for MEMS devices at the 
conceptual design phase using neural networks. At the conceptual design phase of a project, before the 
MEMS devices are actually built and tested, traditional methods of quantifying reliability are inadequate
because the device does not yet exist and cannot be tested to establish reliability distributions. A novel approach
using neural networks will be used to predict the overall reliability of a MEMS device based on its 
components and each component’s attributes. 

The model will extrapolate reliability from previously tested, but similar, MEMS devices. High-level 
system attributes that will be modeled include design attributes, physical characteristics, material 
property characteristics, fabrication environment, fabrication technique, quality level, testing and 
validation level, packaging, and the environment the device will be used in. A good modeling scheme 
must have the characteristics shown in Table 1 to provide acceptable results. 

• Dynamic - the model must be able to adapt and change as new information is added

• Robust - the model must function in areas outside of the input data regime (training sets)

• Relevant - the model must provide information that is both informative and accurate

• Objective - the model must not be too reliant on subjective criteria

• Comprehensive - the model must provide an accurate and complete picture of the
relationship between the input parameters and output


Table 1. Model Criteria 

2.1 GENERAL MODELING APPROACH 

The general approach to developing neural networks to predict MEMS reliability consists of 
decomposing the system to its component level (gears, gyros, springs, etc.), then selecting which MEMS 
component attributes have a correlation to its component reliability. For this analysis, humidity, 
operating frequency, resonant frequency, spring quotient, and force component were all selected as 
MEMS microengine attributes to be modeled. Due to the limited access to sufficient data, only this set 
was initially used. However, in subsequent research, a more comprehensive set of MEMS attributes 
should be tested and modeled. Next, data on the selected attributes and the overall component reliability 
(failure times) are collected through a systematic testing approach. In total, 787 MEMS microengines 
were tested and used in this research. The failure data are collected and then segregated into similar 
sets — data whose input MEMS attributes are similar. These groupings of data are then individually fit to 
different types of probability distributions to evaluate the best fit. The most accurate probability 
distribution for the MEMS microengine failure data will be used as the model output. 
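
The distribution-fitting step can be sketched as follows. This report does not state which fitting procedure was used, so the sketch assumes probability-plot (median rank) regression and compares only Weibull and lognormal candidates by the R² of the linearized plot. The cycles-to-failure values are invented for illustration.

```python
import math
from statistics import NormalDist, mean


def _fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept, r_squared)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def median_ranks(n):
    """Benard's approximation to the median rank of the i-th of n failures."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def fit_weibull(cycles):
    """Probability-plot fit; returns (shape, scale, r_squared)."""
    xs = [math.log(t) for t in sorted(cycles)]
    ys = [math.log(-math.log(1.0 - f)) for f in median_ranks(len(xs))]
    slope, intercept, r2 = _fit_line(xs, ys)
    return slope, math.exp(-intercept / slope), r2


def fit_lognormal(cycles):
    """Probability-plot fit; returns (mu, sigma, r_squared) of ln(cycles)."""
    xs = [math.log(t) for t in sorted(cycles)]
    ys = [NormalDist().inv_cdf(f) for f in median_ranks(len(xs))]
    slope, intercept, r2 = _fit_line(xs, ys)
    return -intercept / slope, 1.0 / slope, r2


# Hypothetical cycles-to-failure for one group of similar microengines.
cycles = [1.2e5, 2.9e5, 4.1e5, 5.5e5, 7.3e5, 9.8e5, 1.4e6, 2.2e6]
fits = {"weibull": fit_weibull(cycles), "lognormal": fit_lognormal(cycles)}
# Keep the candidate whose probability plot is closest to a straight line.
best = max(fits, key=lambda name: fits[name][2])
```

The fitted parameters of the selected distribution would then serve as the output values for that group of data.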

While evaluating the different probability distributions, we observed that some of the data demonstrated 
bimodality. To accurately model this feature, the output of the network was modified to accommodate 
two distributions (labeled the upper and lower). For those groupings of data that were unimodal, the 
distribution was duplicated for both the upper and lower output parameters during the training phase. 
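
The duplication of a unimodal distribution across the upper and lower outputs can be sketched as follows. The four-value layout and the function name are assumptions made for illustration; the report specifies only that two distributions were accommodated.

```python
from typing import List, Optional, Tuple

Params = Tuple[float, float]  # (location, spread) of one fitted distribution


def make_target(lower: Params, upper: Optional[Params] = None) -> List[float]:
    """Build a four-value training target: lower parameters, then upper.

    For a unimodal group, `upper` is omitted and the single distribution is
    duplicated so that every training row has the same output width.
    """
    if upper is None:
        upper = lower
    return [lower[0], lower[1], upper[0], upper[1]]


# Hypothetical parameter values, for illustration only.
unimodal_row = make_target((12.0, 0.5))
bimodal_row = make_target((10.0, 0.4), (13.0, 0.6))
```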

The microengine failure data are collected and then transformed into a format compatible with the modeling
software. For instance, all parameters must be in a numeric form and must be modified accordingly. 
Feature extraction and other forms of data manipulation are employed to enhance the modeling results 
(see subsequent sections for more details on these processes). 
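
As a minimal sketch of putting parameters into numeric form, the helpers below one-hot encode a categorical field and rescale a numeric field onto [0, 1]. Both transformations are common choices assumed here for illustration; the report does not say which ones were applied, and the packaging labels are hypothetical.

```python
def one_hot(values):
    """Map each categorical value to a list of 0/1 indicators.

    Returns the sorted category list and one indicator row per input value.
    """
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows


def min_max_scale(values):
    """Rescale numeric values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw fields for three devices.
packaging = ["ceramic", "plastic", "ceramic"]
humidity_pct = [10.0, 20.0, 30.0]

categories, packaging_numeric = one_hot(packaging)
humidity_scaled = min_max_scale(humidity_pct)
```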

Neural network modeling software is then used to create the neural networks. We selected AbTech’s
Model Quest Expert software, which is commercially available, because of the variety of the modeling
schemes it uses and for its adherence to the modeling criteria listed in Table 1. The transformed data are
randomly partitioned into two sets (training and validation). The training set of data is entered into
AbTech’s software to build the neural networks. Once the networks are trained, the validation set
is used to determine model performance. The software applies the
validation data to the trained network to predict the failure distribution (note, during testing with the 
validation data, only the input data is provided to the model). After the software predicts the failure 
probability distribution, it is compared to the known probability distribution. Specifically, statistical 
parameters (standard deviation, R², etc.) are calculated to compare the predicted values to the known
values for each different type of neural network being evaluated (Statistical Networks, K-Nearest 
Neighbors, Regression Analysis, Decision Tree). Finally, the effectiveness of each modeling technique 
is evaluated and the best modeling approach is selected. 
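
The comparison of predicted and known values can be sketched with a generic R² calculation and a simple k-nearest-neighbors predictor, one of the model types listed above. This is an independent illustration under stated assumptions; it does not reproduce the internals of the commercial software, and the data are invented.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_x, train_y, query, k=3):
    """Average the outputs of the k training rows closest to `query`
    (Euclidean distance on the input attributes)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Hypothetical scaled attributes (inputs) and a fitted parameter (output).
train_x = [[0.0, 0.1], [0.2, 0.3], [0.8, 0.9], [1.0, 1.0]]
train_y = [14.0, 13.5, 11.0, 10.5]
valid_x = [[0.1, 0.2], [0.9, 0.95]]
valid_y = [13.8, 10.8]

# Only the inputs of the validation rows are given to the model.
predicted = [knn_predict(train_x, train_y, q, k=2) for q in valid_x]
score = r_squared(valid_y, predicted)
```

Repeating this scoring for each candidate model type gives a common basis for selecting the best approach.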



Figure 3. Methodology for developing neural nets to predict MEMS reliability. 

In Figure 3, the general methodology for predicting MEMS reliability using neural networks is shown.
There are three main phases to this process: training, validation, and use. The first step is to gather 
information on MEMS data along with the reliability values obtained through testing. These MEMS 
devices will then be decomposed into component levels (i.e., gears, gyros, and springs), and attribute data 
(input) and component reliability data (output) will be compiled to develop the neural networks. The
reliability data for each type of component will be fit to a reliability distribution to obtain its characteristic
coefficients (e.g., μ and σ for Normal, or α and β for Weibull). The accumulated data will be split
into two groups: the majority into the training set (which will be used to train and develop the neural
nets) and the remainder into the validation set (which will be used to independently verify that the
developed neural nets are accurately predicting the component reliability). Different neural networks are trained and tested for
each type of MEMS component (gears, gyros, springs, etc.). 

Once the training set has trained the neural nets for each type of component, the validation set is used to 
verify that the neural net is estimating the component reliability. After proper validation, the trained 
neural networks can be used as a predictive tool for MEMS reliability. 

Neural network modeling involves much more than gathering a set of raw data and feeding it directly to a modeling
algorithm. Success requires a sequence of coordinated steps. The process of developing neural networks
to predict reliability of MEMS follows the sequential steps of (i) identification, (ii) transformation, (iii)
modeling, and (iv) analysis. These steps are further analyzed in the following sections.

2.1.1 Identify 

In this step, the data are identified and characterized. This is a critical step in the 
modeling process because the results depend so heavily on the quality and selectivity of the input 
parameters. Several issues arise in this step and are outlined below. 

2.1.1.1 Data Set Identification 

The first priority is to determine what data will be used to build the models (“training” data) and 
to determine how well the chosen model works (“validation” data). When testing the effectiveness of the 
models, it is extremely important to have an independent data set that contains examples that were not 
used to train the models; that is why a portion of the data (randomly selected) is set aside for validation. 
This verifies the ability of the models to work well on new, unseen data, as they must when they are 
implemented for actual reliability prediction. 
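
The random partition described above can be sketched as follows. This is a minimal illustration only: the function name, the 20% validation fraction, and the fixed seed are assumptions made for the example and are not taken from the study.

```python
import random


def split_train_validation(records, validation_fraction=0.2, seed=0):
    """Randomly partition records into (training, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    # The first n_validation shuffled records are held out; the rest train the model.
    return shuffled[n_validation:], shuffled[:n_validation]


records = list(range(50))  # stand-in for 50 reliability records
training, validation = split_train_validation(records)
```

Every record lands in exactly one of the two sets, so the validation examples are never seen during training.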

2.1.1.2 Variable Selection 

Once the data set has been identified, it is necessary to determine which of the data fields will be used for 
predictors (inputs) and which parameter will be predicted (output). The inputs are sometimes called 
independent variables, and the output is called the dependent variable, since its value is driven by the 
values of the other fields. The format of the output variable will directly affect which modeling approach 
is used. New input variables can be created from existing variables to create more powerful modeling 
(for more information on this process, see Section 2.1.2.3, Feature Extraction, below). 

2.1.1.3 Data Inadequacies/Improvements 

The raw data often are not ready to be modeled because of data inadequacies. Some of the common 
problems encountered with data to be modeled with neural networks are discussed below. All of these 
issues will be addressed when the MEMS data are transformed for neural network modeling. 

Format - Data may be in text, date, or some other nonnumeric format. Most neural network algorithms 
deal only with numeric fields. For example, while an input parameter to be modeled may have values of 
“yes” or “no,” these would have to be changed to 1s and 0s to be compatible with the modeling 
techniques. 

Representation - In some cases, it may be necessary to represent the data in a different manner. This is 
often the case when dealing with categorical or nominal variables that do not have a natural ordering. 

For example, the fabrication process may be an important input variable. Instead of coding the 
fabrication processes of bulk micromachining, surface micromachining, and mold micromachining as 1, 2, 
and 3, it may make more sense to represent this one variable as three separate binary input fields 
(one for each micromachining process). In this way, an unintended sequential relation 
between the different fabrication methods is not modeled. 
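
The three-field binary representation of the fabrication process can be sketched as below. The category labels and the function name are hypothetical and chosen only for illustration.

```python
# Hypothetical labels for the three micromachining processes named in the text.
FABRICATION_PROCESSES = ("bulk", "surface", "mold")


def one_hot(process, categories=FABRICATION_PROCESSES):
    """Return one binary field per category, with a 1 only in the matching field."""
    if process not in categories:
        raise ValueError(f"unknown fabrication process: {process!r}")
    return [1 if process == category else 0 for category in categories]


encoded = one_hot("surface")
```

Because each process gets its own field, no process is numerically "larger" than another.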

Null - Most neural network techniques do not deal with null values, where data are missing. There are 
different methods for dealing with nulls during the transformation step of the modeling process. It may 
make sense to fill in the average of the variable for any missing data, or delete any record with a null, or 
possibly to interpolate the value based on neighboring records for time-series data. 
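
Two of the null-handling options mentioned above (filling with the variable's average, or deleting incomplete records) might look like the following sketch. Missing values are assumed to be represented by `None`, and the function names and sample humidity values are invented for the example.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace each missing entry (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot impute a column with no known values")
    average = mean(known)
    return [average if v is None else v for v in values]


def drop_incomplete(records):
    """Keep only the records that contain no missing fields."""
    return [r for r in records if all(v is not None for v in r)]


humidity = [30.0, None, 50.0, 40.0]  # invented sample column
filled = fill_with_mean(humidity)
```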

Feature - There may be known relationships that are important to modeling an output that are not 
represented in the original data set. For example, the operating temperature may be an important 
variable, where the maximum and minimum temperatures are known and defined as input variables. In 
some instances, the difference between the maximum and minimum temperatures can provide additional clarity 
to the model that the other two alone may not provide. This feature can easily be added to the original 
variables by subtracting the two temperatures for each record and setting this value as a third input 
parameter. Some modeling techniques, by their nature, are more adept than others at automatically 
figuring out these important but simple relationships. These methods of feature extraction enable a more 
robust model. 
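
As a small illustration of the temperature example, the derived input can be added to each record as shown here. The field names `temp_min`, `temp_max`, and `temp_range` are assumptions made for this sketch.

```python
def add_temperature_range(record):
    """Return a copy of the record with a derived 'temp_range' input field."""
    extended = dict(record)  # leave the original record unchanged
    extended["temp_range"] = record["temp_max"] - record["temp_min"]
    return extended


sample = add_temperature_range({"temp_min": -40.0, "temp_max": 85.0})
```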

Data Distribution - Models can often be improved by ensuring that they are getting representative 
example data. Some of the common distribution problems include: 

• Distribution of training data - sparse input data regions: 

In cases where certain regions of the problem are not well represented (sparse regions), certain 
sampling techniques can help compensate, for instance, over-sampling. This will ensure that the 
whole modeling regime is well represented in the modeling process. 

• Distribution of training data - skewed representation of output cases: 

While good representation of data from an input standpoint is important, output data distribution is 
equally important. Often a database may appear to be well distributed; however, when looking at the 
distribution of its outputs, a skewed representation may exist. If most of the training examples 
occupy a small subregion of the entire problem domain, the resulting model will perform much better 
in this small subregion and performance will likely suffer elsewhere. Again, sampling techniques 
can be used to over-sample in these sparse regions. 

• Outliers: 

Outliers are data examples that fall far outside the majority of the database. Outliers represent either 
valid data points that are simply anomalous situations or may identify areas in which the raw data 
was incorrectly produced or recorded. 

The presence of outliers in training data can skew a model to “capture” the outlier, and can skew 
performance results. Approaches to dealing with outliers include eliminating them from the 
database, over-sampling to create additional examples in the sparse region of the outliers, and training 
separate models for the more heavily populated regions and for the sparse regions 
containing outliers. 

• Differences between training and validation data: 

If the data set is partitioned into two sets (one for training and one for validation), it is 
important to verify that the two data sets are characteristically similar by comparing their statistical 
parameters. If the two data sets are significantly different (as indicated by their means, standard 
deviations, etc.), then the model is being tested with data that are statistically different from the data with which it was 
trained. Therefore, care must be taken when partitioning the data to ensure the similarity of the two 
data sets. 
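
A simple check of the similarity between the training and validation sets could be sketched as follows. The comparison rule (means and standard deviations agreeing to within a fraction of the training standard deviation) and the 0.25 tolerance are assumptions chosen for illustration; the text does not prescribe a specific test.

```python
from statistics import mean, stdev


def summarize(values):
    """Mean and sample standard deviation of one variable."""
    return {"mean": mean(values), "std": stdev(values)}


def sets_look_similar(train, validation, tolerance=0.25):
    """True if the means and standard deviations differ by no more than
    tolerance times the training standard deviation (an assumed rule of thumb)."""
    t, v = summarize(train), summarize(validation)
    limit = tolerance * t["std"]
    return (abs(t["mean"] - v["mean"]) <= limit
            and abs(t["std"] - v["std"]) <= limit)


train = [float(x) for x in range(1, 11)]
similar = sets_look_similar(train, [2.0, 5.0, 9.0])
dissimilar = sets_look_similar(train, [8.0, 9.0, 10.0])
```

In practice the check would be repeated for every input and output variable, and a failed check would prompt a new random partition.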

2.1.2 Transform 

Properly representing and transforming data can make the difference between success and failure in the 
modeling process. There are several different approaches to coding and representing data so that certain 
characteristics are more obvious to the subsequent modeling algorithm. 

2.1.2.1 Data Coding and Representation 

For modeling that requires numeric data, the manner in which symbolic data are converted 
to numeric form is critical. In general, variables that have symbolic values in their raw 
form fall into two categories: ordinal and nominal. 

Ordinal - There is a logical, sequential ordering to the variable. Examples include operating temperature 
(very cold, cold, room temperature, warm, and hot) and many types of ratings (excellent, good, fair, 
poor). In this case, an integer value can simply be assigned to each of the original symbolic values (excellent = 
4, good = 3, etc.), which will capture the ordinal nature of the data field. 

Nominal - There is no inherent ordering; nominal values simply imply labels or states. The different 
symbolic values merely represent different cases that cannot be compared to one another on any logical 
scale — the ordering of the values is irrelevant. An example of a nominal variable is religion. Assuming 
no bias, a model based on Catholic = 1, Muslim = 2, Hindu = 3, and Buddhist = 4 will be equivalent to a 
model with the labels renumbered. Thus, assigning sequential integer values would incorrectly 
imply to the modeling algorithm that examples with higher values were somehow “more or less of 
something” in a physical sense. In actuality, different symbolic values for nominal variables simply 
indicate different cases and do not imply any relative importance. 

Another class of variables that require numeric coding is cyclic variables, such as Day of Week or Day of 
Year. Cyclic variables cannot be coded numerically with sequential integers due to the discontinuity at 
the ends of the scale. For instance, an integer coding of the symbolic variable Day of Week where 
Sunday = 1, Monday = 2, etc., is incorrect since there is a discontinuity between Saturday and Sunday. 
The fact that Monday follows Sunday is represented by the fact that 2 follows 1. However, Sunday 
follows Saturday, but 1 certainly does not follow 7. 

Coding cyclic parameters using dummy variables can overcome this discontinuity. For the variable Day 
of Week, seven new dummy variables would be created from the original variable, thereby eliminating 
the discontinuity at the Saturday/Sunday transition. However, the cyclic nature of this variable is lost in 
this approach. A different approach is to represent the days of the week along a unit circle in 
two-dimensional Cartesian coordinates. Each value would be mapped to an (x, y) location along the circle, 
with adjacent values spaced 360/7 degrees (about 51.4 degrees) apart. Thus, if the variable Day were represented originally in a 1 to 7 
format, then for each record we would convert: 

Day_x = cos[(360°/7)(Day)] 

Day_y = sin[(360°/7)(Day)] 

This method preserves the cyclical nature of the variable, with consecutive days being closest to each 
other, without tripping over the week transition. 
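
The unit-circle coding can be illustrated with a short sketch. The function name and the use of radians (2π/7 instead of 360/7 degrees) are choices made for this example.

```python
import math


def encode_cyclic(value, period):
    """Map a cyclic value (e.g., Day in 1..7) to an (x, y) point on the unit circle."""
    angle = 2.0 * math.pi * value / period  # same as (360/period) degrees per step
    return math.cos(angle), math.sin(angle)


days = {day: encode_cyclic(day, 7) for day in range(1, 8)}

# Distances between coded days: the week transition is no longer a discontinuity.
saturday_to_sunday = math.dist(days[7], days[1])
sunday_to_monday = math.dist(days[1], days[2])
sunday_to_wednesday = math.dist(days[1], days[4])
```

Saturday and Sunday end up exactly as close together as Sunday and Monday, while days farther apart in the week are farther apart on the circle.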

2.1.2.2 Data Sampling 

Data sampling is used in situations where certain portions of the database are either under- or 
overrepresented. Sparse and/or underpopulated regions will tend to bias some of the modeling schemes. 
Data sampling simply duplicates data examples according to predefined criteria. It is often beneficial to 
add noise when duplicating data, since it adds robustness to the model. 
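
A minimal sketch of duplicating under-represented examples with added noise is given below. The multiplicative Gaussian noise model, the 1% noise level, and the function name are assumptions made for illustration; the text does not specify how the noise is generated.

```python
import random


def oversample_with_noise(records, copies, noise_fraction=0.01, seed=0):
    """Return the original records plus `copies` noisy duplicates of each one.

    Each duplicated value is scaled by (1 + g), where g is drawn from a
    Gaussian with standard deviation `noise_fraction` (an assumed noise model).
    """
    rng = random.Random(seed)
    augmented = [list(record) for record in records]
    for record in records:
        for _ in range(copies):
            augmented.append(
                [x * (1.0 + rng.gauss(0.0, noise_fraction)) for x in record]
            )
    return augmented


sparse_region = [[10.0, 200.0], [12.0, 220.0]]  # invented sparse examples
augmented = oversample_with_noise(sparse_region, copies=3)
```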

2.1.2.3 Feature Extraction 

As discussed earlier, the transformation of the data before modeling is often the most critical step in 
using neural networks. Another critical step in the modeling process is extracting new features from the 
data to use as input variables. 

Feature extraction transforms raw data into a more useful form by creating new input variables from 
existing variables. It is important to realize that feature extraction does not “create” new information. 
Rather, feature extraction “massages” the information in the raw data and presents it in a new light to the 
modeling algorithm. Feature extraction is an extremely useful method for evaluating what is known 
about a problem. This prevents the learning algorithm from having to determine important relations in 
the data that are already known. 

Sometimes, the characteristic that a particular feature extraction algorithm captures has some physical 
significance. Often however, it is difficult or impossible to attach real-world meaning to a specific 
feature. Although a feature may not have physical significance or meaning, it may still be useful for 
modeling the patterns and trends in the data. 

For static decision problems, the most common features are transforms of existing variables. An 
example of a single-variable feature is the natural logarithm of an existing variable. Logarithmic features 
are often useful for reducing the dynamic range of variables and for transforming the exponential behavior that 
sometimes exists in the data into a more linear form, which is easier to model. 
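
The effect of a logarithmic feature on dynamic range can be seen in a small example. The cycles-to-failure values below are invented for illustration.

```python
import math

# Invented values spanning four orders of magnitude.
cycles_to_failure = [1.0e4, 1.0e6, 1.0e8]

# The natural logarithm compresses the range and makes the spacing uniform.
log_cycles = [math.log(c) for c in cycles_to_failure]

raw_ratio = cycles_to_failure[-1] / cycles_to_failure[0]  # 10,000-fold spread
log_span = log_cycles[-1] - log_cycles[0]                 # equals ln(10,000)
```

Values that grow by a constant factor become evenly spaced after the transform, which is the "more linear form" referred to above.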

2.1.3 Model 


Once the data have been preprocessed and placed into the proper formats, they are ready to be mined for 
information. The neural network models are trained to classify or estimate outputs. Several different 
mining schemes should be evaluated to determine which neural networks provide the best performance 
for the given type of data. 

The Model step consists of defining neural networks for the selected problem type. This involves: 

1. Designating the inputs and the outputs to the model 

2. Identifying the training and validation sets 

3. Selecting the mining strategies, as well as the modeling parameters 

4. Executing the resulting model 

5. Analyzing the resulting models 

6. Applying the best mining strategy to subsequent data 

Table 2. Modeling Steps 


2.1.4 Analyze 

When analyzing the results of the modeling, it is very important that the performance of any model be 
determined with data that were not used for training. Testing models on unseen data more closely represents 
the manner in which the model will be used in practice (i.e., on data that were not used for training) and 
is therefore a more realistic evaluation approach. 

2.1.4.1 Estimation Problems 

After modeling with neural networks, error statistics should be calculated to determine a comparison 
measure of how well each model is working. The error statistics are calculated by subtracting the model 
estimate from the actual value of the output to determine the error for each example. Then, aggregate 
statistics can be calculated that describe how well the model performed on the data sets. The following 
types of error measures will be calculated to determine a comparison of how well the different modeling 
schemes are working: 

Average Absolute Error - This is an average of the absolute error of each sample. This evaluation 
criterion measures the overall accuracy of the model. 

Maximum Absolute Error - When large individual errors are intolerable for critical systems, this is a key 
evaluation metric that should be minimized. 

Standard Deviation - This metric is a measure of the spread of the error. The larger the spread of the 
error, the less consistent the model is over all ranges of values. This should be minimized and looked at 
in conjunction with the previous two statistics. 

Coefficient of Determination (R²) - This metric is a measure of the correlation between two data sets, here 
between the model estimates and the actual values. R² represents the proportion of variation in the 
dependent variable that has been explained or accounted for by the regression equation. The R² value 
may vary from zero to one. R² = 0 indicates that none of the variation in Y is explained by the regression 
equation, whereas R² = 1 indicates that 100% of the variation in Y has been explained by the regression 
equation. 

It is also often useful to graph the actual values versus the model estimate values, or the actual values 
versus the errors to see if there are larger deviations based on the actual value. 
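
The four error measures listed above can be computed as in the following sketch. The function and key names are hypothetical. The R² shown uses the common definition 1 - SS_res/SS_tot, which is an assumption about the form intended here; with that definition a very poor model can even yield a negative value on validation data.

```python
from statistics import mean, pstdev


def error_statistics(actual, estimated):
    """Aggregate error measures for paired actual and estimated outputs."""
    # Error is the actual value minus the model estimate, as described in the text.
    errors = [a - e for a, e in zip(actual, estimated)]
    abs_errors = [abs(err) for err in errors]
    mean_actual = mean(actual)
    ss_res = sum(err ** 2 for err in errors)               # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total variation
    return {
        "average_absolute_error": mean(abs_errors),
        "maximum_absolute_error": max(abs_errors),
        "error_std": pstdev(errors),
        "r_squared": 1.0 - ss_res / ss_tot,
    }


actual = [1.0, 2.0, 3.0, 4.0]     # invented values
estimated = [1.5, 2.0, 2.5, 4.0]  # invented model estimates
stats = error_statistics(actual, estimated)
```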

2.1.4.2 Classification Problems 

Some of the metrics used for estimation types of problems can also provide knowledge for classification 
problems. However, it is usually more productive to look at the actual classification statistics, and 
minimize the number of incorrect classifications. 

The key to any machine-learning strategy is the learning algorithm itself. It must be able to generalize 
from, and not memorize, numerical examples of a problem domain. The model should discover 
relationships found within the data to perform well for not only the training data but also independent 
(i.e., real-world) data. The main reason for this requirement is that all data contain uncertainty. Noisy, 
missing, conflicting, and erroneous data are manifestations of uncertainty in numerical examples. 

An effective machine-learning algorithm must learn relationships and avoid memorizing noise. And to 
be practical, it must achieve these goals in an automated manner. 

2.2 SANDIA NATIONAL LABORATORIES MICROENGINES 

The main obstacle to this research is the lack of data, in both quantity and quality, needed for 
adequate training of the neural networks. Very few data on MEMS reliability are available, since 
most commercial manufacturers consider their reliability data proprietary. Most universities and 
research institutions do not have the quantity of similar data required for adequate modeling. 

However, Sandia National Laboratories in Albuquerque, New Mexico, has been manufacturing MEMS 
components for several years. Sandia is very interested in MEMS technology for applications on missile 
arming systems. Sandia is emphasizing MEMS research because these devices have high reliability, low 
power consumption, and small size and weight. 

Sandia has shown a lot of interest in the novel approach developed in this research and has graciously 
made all their reliability data available for this research. Most of the reliability data are from the same 
basic MEMS design, with only minor design and operating environment parameters varied. Even though 
this is a limited test sample, it may provide an excellent basis to determine the feasibility of this modeling 
approach. Figure 4 shows a view of Sandia’s microengine. 

Figure 4. Photomicrograph of Sandia microengine (courtesy of Sandia National Labs). 

The Sandia microengine operation is fairly simple. It uses an electrostatic comb drive, with 
alternating currents supplied to the fingers of the comb drive. First, an electric charge is sent to the upper comb 
elements, which pull the drive up. Then this charge is released and the mechanical flexure (“restoring 
springs”) of the beam pulls the comb drive down to its neutral position. Next, a charge is placed on the 
bottom comb elements, which bring the drive further down. This charge is then released and the comb drive 
rises back to its neutral position. This sequence is then repeated in a coordinated fashion to drive the 
shuttle in harmonic motion. Like any other mechanical oscillating system, these microengines have a 
resonant frequency. The testing was done at frequencies above and below this resonant frequency. The 
two shuttles (perpendicular to each other, labeled “X” and “Y” on Figure 5) drive the pin-jointed wheel, 
which is connected through a hub. This wheel can then be used to drive a transmission or other 
mechanical system (see Figure 5). 



Figure 5. Sandia microengine, annotated (courtesy of Sandia National Labs). 

2.2.1 Sandia’s Reliability Test Equipment 


To collect large amounts of reliability data, Sandia has developed a method to test multiple devices 
simultaneously instead of testing each device individually. This methodology enables efficient testing of large 
numbers of MEMS devices. Sandia Labs has developed a multipart MEMS test 
station, known as SHiMMeR (Tanner, 1997). 

Figures 6 and 7 show inside and outside views of this system. The SHiMMeR system allows testers to 
optically inspect the test articles for functionality through a series of electrical and optical subsystems. 
The electrical subsystem allows user-defined electrical signals to be sent to each test article (the 
packaged MEMS parts being tested). The drive signals are sent to all of the MEMS devices. This whole 
process is self-contained in an automated package that makes the whole testing sequence fairly easy 
(Tanner, 1997). 



Figure 6. SHiMMeR, inside view (courtesy of Sandia National Labs). 


The SHiMMeR system also has an optical subsystem including a microscope and camera which steps 
from part to part to inspect the functionality of each of the test articles. 



Figure 7. SHiMMeR, outside view (courtesy of Sandia National Labs). 

Each test bed consists of a 4 x 2 array of printed circuit boards with up to 64 packages, for a total of 
256 parts (the current configuration has four microengines per package); see Figure 8. This arrangement 
of multiple small printed circuit boards rather than one large board provides great flexibility in the 
arrangement, device wiring, and signal optimization of MEMS devices under test (Tanner, 1997). 

The fully computer-controlled system allows for the images to be captured at very precise instants in 
time. This test equipment was used to test 787 Sandia microengines under varied conditions so that the 
microengine’s reliability could be modeled. The raw data from these tests are shown in the Appendix. 



Figure 8. A Sandia microengine package with four engines (courtesy of Sandia National Labs). 



SECTION 3: RESULTS AND DISCUSSION 


There are several different failure modes existing in the Sandia microengines. Out of all the failure 
modes found, the predominant mode is wear. The close rubbing surfaces (0.5 microns or less) of the pin 
joint and hub region in the Sandia microengines create sufficient wear, which leads to failure over time. 
Wear debris can jam gears or actuator arms leading to sticking and rocking of the microengines. Also, 
wear particles can short electrical components and cause failure of the microengine. Or, worn 
components like pin joints can rupture and come undone after gradual degradation. In addition, particle 
contamination (an insufficient clean room, or debris from the cutting process) during wafer dicing 
(cutting the wafer into individual MEMS chips) can create similar problems for the microengines 
(Tanner, 2000). 

Stiction (adhesion of the moving parts) is another primary failure mode in the microengines. It results 
from the capillary forces that exist between the microscopic parts and liquid remnants from the drying 
process. Surface coatings, super-critical drying, and the use of dimples can mitigate the onset of stiction. 
Stiction, in its worst form, can lead to the fusion of components. For instance, the high voltages used in 
the comb drives can cause arcing between the comb elements that results in a permanent weld. Guides 
that prevent moving parts from actually touching can minimize these problems (Tanner, 2000). 

Surprisingly, fatigue, fracture and corrosion are insignificant sources of failure in the Sandia 
microengines. The most common failure modes for the microengines are summarized above. All others 
only contribute minimally to microengine failure. The reason that the microengines are more resilient 
against fatigue, fracture, and corrosion may be that the underlying building material for these engines is 
polysilicon. Polysilicon is self-healing and will bond and repair itself as cracks form. In addition, 
polysilicon is not susceptible to creep. Fracture is only seen when the wear has thinned a structure (e.g., 
motor hub) to the point that a crack induces catastrophic failure. 

3.1 MEMS DATA 

The MEMS microengine data collected from Sandia (787 microengines, shown in the Appendix) were 
used to train several different types of neural networks. The networks were created to predict not just a 
specific failure time (point value), but a whole probability distribution of failure times. The 
greater resolution obtained with an entire distribution has more utility in concept analysis than a 
single failure time. Therefore, each set of microengine data (those with common sets of input 
parameters) was individually fit to separate probability distributions. After trying several different 
distributions, the log-normal distribution provided the best fit to the microengine failure data. The reason 
for this may be that the suitability of the log-normal distribution for semiconductor devices has been recognized and 
empirically demonstrated for some time (Howard and Dodson, 1961). Its acceptability as a failure 
distribution was shown by the life-test sampling plans that were developed for it (Gupta, 1962). 
Therefore, it is logical that the log-normal would provide a good fit for the data. The log-normal is a 
two-parameter distribution consisting of t50, the median cycles to failure, and the characteristic shape 
parameter, σ. 
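
To make the two log-normal parameters concrete, the sketch below estimates t50 and σ from a set of failure times using the mean and standard deviation of the logged data. This simple estimator assumes complete (uncensored) data and is offered only as an illustration; it is not necessarily the fitting procedure used in this study, and the failure values are invented.

```python
import math
from statistics import mean, stdev


def fit_lognormal(cycles_to_failure):
    """Estimate (t50, sigma) from the logs of complete failure data."""
    logs = [math.log(c) for c in cycles_to_failure]
    t50 = math.exp(mean(logs))  # median of a log-normal is exp(mean of ln t)
    sigma = stdev(logs)         # shape parameter: standard deviation of ln t
    return t50, sigma


def lognormal_cdf(t, t50, sigma):
    """Fraction of devices expected to have failed by t cycles."""
    z = (math.log(t) - math.log(t50)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


failures = [1.0e5, 1.0e6, 1.0e7]  # invented cycles-to-failure values
t50, sigma = fit_lognormal(failures)
```

By construction, half of the population is predicted to have failed at t = t50.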

3.2 BIMODAL DISTRIBUTIONS 


Upon closer inspection of the data, some of the resulting distributions showed a bimodal tendency: the 
distribution has two regions of data concentration. Specifically, the two modes, or distinct humps, in 
the bimodal distributions reflect the relatively high frequencies of the two separate clusterings of data. 
This bimodality of some of the data must be modeled. This was achieved by modifying the output 
domain of the networks. The output, instead of containing two parameters of a single distribution, was 
modified to cover two separate distributions, which would then be combined into one distribution 
through a weighting scheme. 

It is interesting to note that, for all the data sets that were described by bimodal distributions, the values 
of σ for the two corresponding modes were similar. The closeness in the values of these two σ's 
probably indicates that the underlying failure modes are the same. Inherent differences between the 
parts may cause the differences between the two population means. For instance, earlier failure times 
could be for the weaker parts and longer failure times for the stronger parts (Tanner, 1999). There can be 
a degree of variability (small differences or aberrations in the silicon crystal or small amounts of defects 
in the etching process, etc.) between the MEMS microengines, even though they are batch-fabricated in a 
no-touch, automated environment. Thus, some of the microengines may be naturally weaker than others, 
even though they are created under the same process. In addition, the drive signals that have been 
devised were optimized for a sample set of microengines; therefore, any subtle differences (differences in 
the resonant frequency, etc.) can lead to undesirable loading of the microengines during testing and a 
subsequent premature failure. 

To account for the possibility of bimodal distributions in the output of the neural networks, we defined 
the output parameters to always contain two distributions — labeled the lower and upper distributions. In 
essence, this would require four parameters (two for each log-normal distribution); these were labeled the 
lower t50 and σ, and the upper t50 and σ. Since two distributions were intentionally defined as the 
outputs from the network, if a distribution is unimodal, then the two parameters of the unimodal were 
duplicated for both the lower and upper parameters during training of the networks. If the distribution is 
bimodal, then the two modes are partitioned into the upper and lower parameters and trained 
appropriately. 

After the networks are trained, if the output medians from the network are distinctly different (greater 
than one standard deviation apart), then the output should be considered two separate distributions 
(bimodal). However, if the medians of both distributions are within one standard deviation from each 
other, then the medians and shape parameters should be averaged and taken as one unimodal distribution. 
If the output is bimodal, a weighting system could be used to determine the influence that each 
distribution has on the combined bimodal distribution. This weighting system could be devised by 
determining the relative counts of each grouping in the training set — what population percentage is 
represented by each of the upper and lower regions. 
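
The post-processing rule described above can be sketched as follows. Several details are assumptions made for this example: the "one standard deviation" test is applied in log space using the average of the two σ values, the weights are supplied by the caller, and the combined distribution is formed as a weighted sum of the two log-normal cumulative distributions. All names and numbers are hypothetical.

```python
import math


def lognormal_cdf(t, t50, sigma):
    """Cumulative fraction failed by t cycles for one log-normal mode."""
    z = (math.log(t) - math.log(t50)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def combine_outputs(lower, upper, lower_weight=0.5):
    """Turn the network's (t50, sigma) pairs into a list of weighted modes.

    Returns [(weight, t50, sigma), ...] with one entry if the medians are
    close (unimodal) and two entries if they are distinctly different.
    """
    (t_lo, s_lo), (t_hi, s_hi) = lower, upper
    separation = abs(math.log(t_hi) - math.log(t_lo))
    if separation > 0.5 * (s_lo + s_hi):  # assumed reading of "one standard deviation"
        return [(lower_weight, t_lo, s_lo), (1.0 - lower_weight, t_hi, s_hi)]
    # Medians are close: average the medians and the shape parameters.
    return [(1.0, 0.5 * (t_lo + t_hi), 0.5 * (s_lo + s_hi))]


def mixture_cdf(t, components):
    """Weighted combination of the component distributions."""
    return sum(w * lognormal_cdf(t, t50, s) for w, t50, s in components)


bimodal = combine_outputs((1.0e5, 0.5), (1.0e7, 0.5), lower_weight=0.3)
unimodal = combine_outputs((1.0e6, 0.5), (1.2e6, 0.5))
```

Here `lower_weight` stands for the fraction of the training population that fell in the lower mode, as suggested by the relative-count weighting described above.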

3.3 TRAINING THE NETWORKS 

Table 3 summarizes all the microengine failure data after they were condensed into separate distributions 
covering the different testing conditions. Sandia did not collect many parameters during the testing 
phase; therefore the input parameters are somewhat limited. However, of the parameters collected, 
several were key determinants of reliability. Humidity, the operating frequency (f), the resonant 
frequency (f0), the ratio of the latter two (f/f0), the spring quotient, and the tangential force component 
imparted to the drive gear were collected and modeled (these parameters should all influence 
microengine reliability). 

The condensed data were directly fed into the neural networks to train the prediction scheme. Six 
different neural networks were trained and then all compared to determine which networks provided the 
best results (these six neural network algorithms were discussed in detail previously). The Error 
Knowledge Network (Error K-Net), Hybrid Knowledge Network (Hybrid K-Net), StatNet, and StatNet Selected Inputs 
(all forms of statistical networks) consistently showed the best prediction capabilities for the specific 
MEMS training data. 


[Table 3 column headings: Model Inputs, Model Outputs; table body not recovered] 

Table 3. Parameters Used to Train Networks 


Tables 5 through 8 show the comparison results (how well each network performed at predicting reliability) 
from each of the different networks. Since four different output parameters were being predicted, each 
output was compared separately. Table 5 shows the statistics for the lower t50 prediction, Table 6 the statistics for 
the lower σ, Table 7 the statistics for the upper t50, and Table 8 the statistics for the upper σ. 

3.4 COMPARING THE DIFFERENT NEURAL NETWORKS 

The main comparison parameter used to evaluate the effectiveness of the modeling is the Pearson 
correlation coefficient, also known as r². We used this measure since it provides a good measure of how 
well each network correlates the inputs to the outputs. It is a statistical measure that assesses the 
strength and direction of the relationship between two different sets of parameters (between the inputs 
and output in the current modeling scheme). 

The coefficient yields a single number that can have a value between 0.0 and 1.0. The closer the value is 
to 1.0, the stronger the relationship; conversely, the closer the value is to 0.0, the weaker the relationship. 
Table 4 suggests a qualitative meaning for ranges of coefficient values (the strength of association given 
by the values of the coefficient). Any value over 0.80 indicates a strong association between the variables. 


r²           Indicator of 
0.80-1.00    Strong association between variables 
0.60-0.79    Strong-moderate association 
0.40-0.59    Weak-moderate association 
0.20-0.39    Weak-weak association 
0.00-0.19    Little, if any, association 


Table 4. Pearson Correlation Coefficient Values 


The formula to calculate r is as follows (Lane, 2000): 

Figure 9. Equation for Pearson correlation coefficient. 

Another comparison statistic used is the Absolute Error, which is the absolute value difference between 
the estimated values and actual values. Other parameters used in the analysis include the corresponding 
maximum value, the average, and standard deviation of the Absolute Error. We also used the Squared 
Error in the analysis, which is just the squared difference between the estimated and actual values. Still 
another comparative measure used to compare the effectivity of the models is the Normalized Root Mean 
Squared. It is the square root of the sum of the Squared Error values divided by the sum of the squared 
actual values. The Normalized Root Mean Squared measures the relative portion of the total value of the 
data that is represented by the error. All of these comparison metrics reveal that the Error Knowledge 
Network is consistently the best neural network algorithm for modeling MEMS microengine data (as 
seen in Table 5 through Table 8). 
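
The comparison statistics described above can be sketched in a few lines of Python. This is an illustrative sketch, not the tool used in the study: the function name `comparison_metrics`, the dictionary keys, and the use of the sample (n - 1) standard deviation are assumptions, and the values in the usage lines are made up for demonstration.

```python
import math


def comparison_metrics(estimated, actual):
    """Compute the error statistics described in the text for paired values.

    The function name and returned keys are illustrative assumptions.
    """
    if len(estimated) != len(actual) or len(actual) == 0:
        raise ValueError("estimated and actual must be non-empty and equal in length")

    abs_err = [abs(e - a) for e, a in zip(estimated, actual)]
    sq_err = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    n = len(abs_err)
    mean_abs = sum(abs_err) / n

    # Assumption: sample standard deviation (n - 1 divisor); the text does not
    # say which form was used.
    if n > 1:
        std_abs = math.sqrt(sum((x - mean_abs) ** 2 for x in abs_err) / (n - 1))
    else:
        std_abs = 0.0

    sum_sq_actual = sum(a * a for a in actual)
    if sum_sq_actual == 0:
        raise ValueError("actual values are all zero; the normalized error is undefined")

    # Normalized Root Mean Squared error:
    # sqrt(sum of squared errors / sum of squared actual values).
    nrms = math.sqrt(sum(sq_err) / sum_sq_actual)

    return {
        "max_abs_error": max(abs_err),
        "mean_abs_error": mean_abs,
        "std_abs_error": std_abs,
        "sum_squared_error": sum(sq_err),
        "nrms": nrms,
    }


# Made-up demonstration values (not data from the report).
stats = comparison_metrics(estimated=[110.0, 190.0, 300.0], actual=[100.0, 200.0, 300.0])
print(stats)
```

Because the normalized error divides by the sum of the squared actual values, it is scale-free, which makes it convenient for comparing outputs whose magnitudes differ by orders of magnitude.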

The Error Knowledge network modeled the lower t50 very well (see Table 5). It significantly
outperformed all the others (e.g., Hybrid K-Net, StatNet Selected Inputs, StatNet, Linear Regression,
and K-Nearest Neighbors) on all of the comparative measures.

The r² value was 0.9868 for the lower t50 using the Error K-Net, which demonstrates exceedingly high
correlation between the inputs and the output. The next-best algorithm was the Hybrid K-Net, with an
r value of only 0.8183, which is still fairly good correlation. The maximum Absolute Error was almost
five times greater for the Hybrid K-Net than for the Error K-Net. Similarly, the other comparison
statistics were several orders of magnitude worse than the Error K-Net modeling results. The modeling
approach in the Error K-Net, which uses a three-pass approach to modeling, is evidently far
superior for the specific type of data involved.

Table 5. Comparison Statistics for Lower t50


The Error Knowledge network also accurately modeled σ for the lower distribution (see Table 6).
The r value was 0.9900 for the Error K-Net (approaching perfect correlation). This demonstrates the
precision with which the network modeled the data. The next-best algorithm was the Hybrid K-Net,
with an r value of only 0.6306. All other neural network schemes provided worse correlation,
performing at far less accurate levels.

Table 6. Comparison Statistics for Lower σ


The r value was 0.9563 for the upper t50 using either the Error K-Net or Hybrid K-Net (see Table 7).
These results demonstrate strong association between the inputs and the output. The Error K-Net
performed exactly the same as the Hybrid K-Net because a second pass of the Error K-Net network was
not applied.

Table 7. Comparison Statistics for Upper t50


The initial prediction of t50 could not be improved upon with subsequent passes to correct any predicted
error, and therefore the initial estimate was used as the final prediction (see Figure 16 and associated
discussion). The StatNet Selected Inputs and regular StatNet provided almost the same results. However,
Linear Regression and K-Nearest Neighbors were significantly less accurate (r of 0.5980 and 0.4709,
respectively). The other comparison metrics for the statistical network approaches (Error K-Net,
Hybrid K-Net, StatNet Selected Inputs, and StatNet) were roughly an order of magnitude better than
those for Linear Regression and K-Nearest Neighbors. Even though all the statistical network
approaches (i.e., Error K-Net, Hybrid K-Net, StatNet, etc.) were roughly equivalent for upper t50, the
Error K-Net was used as the standard modeling approach for consistency.

The r value was 0.9194 for the upper σ using the Error K-Net approach (see Table 8). This also
demonstrates sufficiently strong correlation between the inputs and the output to verify accurate
modeling (see Table 4). The other statistical network algorithms provide r values of around 0.785. The
r values for the Linear Regression and K-Nearest Neighbors algorithms were far worse. Similarly, the
other comparison statistics for the other networks were clearly worse than the Error K-Net modeling
results.

A factor that may have contributed to the upper t50 and upper σ having lower r values than
the lower t50 and lower σ is that the lower values were duplicated for the unimodal case. Therefore,
the upper bimodal outputs may be adversely influenced by the lower statistics
(where the unimodal case was duplicated). However, the overall accuracy of the modeling in
predicting all the output distributions supports this general approach.

For the types of correlations existing in the microengine data, the inherent capabilities of the Error K-Net 
for modeling the given data suggest it should be the standard modeling approach. Modeling with this 
approach yielded better results and consistently outperformed all the other neural network algorithms. 
The subsequent analysis will concentrate on just the Error K-Net approach.

Table 8. Comparison Statistics for Upper σ


Table 9 shows statistics on both the input data and the model’s output for the Error K-Net. It also
shows statistics on the error and squared error (from the model’s prediction). The average
Absolute Error ranges from one to several orders of magnitude less than the average of the model’s
output; therefore, the error can be considered relatively small. As discussed previously, r for all four
outputs is quite good: all are greater than 0.90, with two approaching 0.99. As seen in Table 4, these
values indicate a strong correlation between the network inputs and output.

Table 9. Data Statistics for Error K-Network (selected)

3.5 THE SELECTED NEURAL NETWORK

Since the Error K-Net provided the best modeling for MEMS microengine data, this section will outline 
the details of the neural network transformations. The Error K-Net consists of a three-pass approach
using separate networks. An initial prediction for the output parameter is made from the first pass of the 
network. Next, a second pass of another network is used to predict the error estimate in this first 
prediction. A third and final pass of a separate network is made to adjust the initial estimate by a factor 
based on the error estimate to create the final prediction. 
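
The three-pass structure can be sketched as a small composition of functions. This is a minimal illustration under stated assumptions: the names `error_knet_predict`, `first_pass`, `second_pass`, and `third_pass` are hypothetical, and the toy passes in the usage lines use made-up coefficients rather than the fitted equations of the study. Passing `None` for the second pass models the case in which no error correction is applied and the initial estimate serves as the final prediction.

```python
from typing import Callable, Optional, Sequence


def error_knet_predict(
    inputs: Sequence[float],
    first_pass: Callable[[Sequence[float]], float],
    second_pass: Optional[Callable[[Sequence[float], float], float]],
    third_pass: Optional[Callable[[float, float], float]] = None,
) -> float:
    """Three-pass prediction: initial estimate, error estimate, adjusted estimate."""
    # Pass 1: initial prediction of the output parameter from the inputs.
    initial = first_pass(inputs)

    # Without a correction stage, the initial estimate is the final prediction.
    if second_pass is None or third_pass is None:
        return initial

    # Pass 2: a separate network estimates the error in the initial prediction.
    error_estimate = second_pass(inputs, initial)

    # Pass 3: a separate network adjusts the initial estimate using that error estimate.
    return third_pass(initial, error_estimate)


# Hypothetical toy passes with made-up coefficients (not the study's equations).
def toy_first(x):
    return 2.0 * x[0] + x[1]


def toy_second(x, initial):
    return 0.1 * initial


def toy_third(initial, error_estimate):
    return initial - error_estimate


corrected = error_knet_predict([1.0, 3.0], toy_first, toy_second, toy_third)
uncorrected = error_knet_predict([1.0, 3.0], toy_first, None)
print(corrected, uncorrected)
```

Keeping each pass as a separate callable mirrors the description of separate networks per pass and makes it easy to drop the correction stage for an output whose error cannot be modeled.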

The Error K-Net developed required only three inputs (humidity, resonant frequency, and f/f0) to
make the initial prediction for lower t50. Figure 10 shows the details of the network, specifically, how
the inputs are transformed to make this initial prediction. The correlation equation is nonlinear and
three-dimensional.


\[
t_{50,\text{initial}} = -0.8739 - 0.7933\,X_1 + 0.3996\,X_1^{2} - 0.3497\,X_1^{3}
                        - 0.507\,X_2 X_3 + 0.2777\,X_3^{2}
\]

LEGEND
X_1 = Humidity
X_2 = Resonant Freq.
X_3 = f/f0

Figure 10. Initial prediction for lower t50.

Once an initial prediction estimate is optimized for t50, an error estimate is formulated to determine a
correction factor.

Figure 11. Error estimate for lower t50.

Figure 11 shows the derived equation to estimate the error in the initial prediction of t50 (Figure 10). The
functional inputs for this transformation are operating frequency and the initial prediction found through
the formula in Figure 10. The initial prediction (Figure 10 transformation) is then adjusted using the
error estimate (Figure 11 transformation) to make the final prediction for t50, as seen in Figure 12. This
transformation is a simple linear two-dimensional equation.

X1 = Initial Predicted t50
X2 = Error Estimate


Figure 12. Final prediction for lower t50.


Figure 13 shows the details of the network for the initial prediction of the lower σ, that is, how the
inputs are transformed into the output. The correlation equation is nonlinear and two-dimensional. As
seen in Figure 13, the only parameters that influence the initial prediction are humidity and f/f0.


Figure 13. Initial prediction for lower σ.

Once an initial prediction estimate is optimized for σ, the error in this estimate is calculated using the
second-pass network shown below. Figure 14 shows the optimized scheme to estimate the error in the
first prediction of σ (see Figure 13 equation). The functional elements of this transformation are
humidity, f/f0, and the initial prediction found through the Figure 13 computation.

Figure 14. Error estimate for lower σ.

Now the initial prediction (Figure 13 transformation) and error estimate (Figure 14 transformation) are
combined to make the final prediction for σ, as seen in the Figure 15 equation. Again, this final
transformation uses a simple two-dimensional linear equation.

\[
\sigma_{\text{final}} = 0.7563\,X_1 - 0.6017\,X_2
\]

X_1 = Initial Predicted σ
X_2 = Error Estimate


Figure 15. Final prediction for lower σ.

The Error K-Net for predicting the upper t50 did not benefit from the second pass (used for correcting any
predicted error in the estimate). This error is too random to accurately predict or model. Therefore, the
initial uncorrected estimate for t50 was the best estimate and could not be improved upon. Figure 16
shows only the first pass of this neural network (which now becomes the final transformation equation).
This network has two nodes, and both are nonlinear and multivariate. The operating frequency and spring
quotient were the only factors that affected the upper median life, t50.

Figure 16. Final prediction for upper t50.

Figure 17 shows the details of the network for the initial prediction of the upper σ, that is, how the
inputs are transformed into a σ prediction.

Figure 17. Initial prediction for upper σ.

The correlation equation is nonlinear and two-dimensional. The only parameters that influence
the initial prediction of upper σ are resonant frequency and f/f0.

Once an initial prediction is optimized for σ, the error in this estimate is calculated using the second-pass
network. This second-pass network has two nodes, both multivariate and nonlinear. Figure 18 shows
the optimized scheme to estimate the error in the σ prediction (Figure 17). The functional elements of
this transformation are humidity and resonant frequency.

Figure 18. Error estimate for upper σ.

Now the initial prediction (Figure 17 transformation) and error estimate (Figure 18 transformation) are
used to make the final prediction for σ, as seen in Figure 19.

Figure 19. Final prediction for upper σ.

3.6 THE EFFECTIVITY OF THE ERROR K-NETWORK 

Statistical parameters were used to develop the actual versus predicted charts seen in Figures 20 through
23. Figure 20 shows that the neural networks for the lower t50 model the data quite effectively and
provide accurate predictions. There are very few deviations from the diagonal congruency line.



[Plot; x-axis: Actual, 0K to 1400K]

Figure 20. Actual vs. estimate for lower t50.

The congruency line (diagonal line) represents perfect correlation (where prediction exactly matches the 
actual values). As the data seen in Figure 20 show, there is not much deviation from the idealized case. 
The average error in estimation was around 6.9%. 

The correlation between actual and predicted values for the lower σ is even tighter (see Figure 21).
There were two data points that were slightly off, but the majority of the predictions were near perfect.
Errors in the collection of the data or other anomalies may explain these two minor deviations. The
average error in the estimation was roughly 1.78%.

Figure 21. Actual vs. estimate for lower σ.

Figure 22 shows the correlation graph between actual and predicted values for the upper t50. There is
only one data estimate that has a significant error in its prediction of t50. This anomaly may be just an
aberration or, as before, the data may have been incorrectly collected. In addition, the errors were too
random (as discussed in the previous sections on modeling upper t50) to effectively correct them during
the second pass of the Error K-Net algorithm; this erratic nature is evident in the graph. Additionally,
some degree of error in the modeling may have been introduced by duplicating the lower distribution
parameters (t50 and σ) in the upper for the unimodal case.

Figure 22. Actual vs. estimate for upper t50.

The one prediction with a significant deviation between the predicted and actual values is
highlighted (see the circled data point in Figure 22). For this data point, the difference
between the upper and lower distribution statistics was quite significant (three orders of magnitude).
This may help explain why there was a larger error in predicting the upper t50 for this one point (the
cross-correlation effects previously described).

As the data in Figure 23 show, there is a fair amount of scatter between the predicted and actual values
of the upper σ, with slight deviations from the idealized case. However, there is sufficient accuracy in
prediction, as evidenced by an r value of 0.9194.

All of the significant deviations were in cases in which the lower distributions were duplicated for the 
upper (all marked with a circle). Besides these four data points, the data were modeled well as 
demonstrated by the closeness of the data to the diagonal congruency line. 

As previously discussed, a reason that the correlation graphs seen in Figures 22 and 23 (upper cases) are 
not as accurate as the graphs for the lower parameters may be that the unimodal data were duplicated for 
the upper parameter in certain cases. However, even though irregularities may be introduced into the 
modeling by using this approach, the overall effects are positive and the benefits seem to far outweigh 
any detriments. 



Figure 23. Actual vs. estimate for upper σ.


3.7 TREND ANALYSIS WITH NEURAL NETWORK PREDICTIONS 

After modeling was complete, the neural networks were used to determine the influence that the input 
parameters have on the corresponding four reliability output parameters. For each of the analysis graphs 
shown (Figures 24 through 33), a different input parameter was varied while the others were held 
constant. This allowed a comparison to be made for each parameter and gave insight into the
sensitivities and effects that the selected (varying) parameter has on the overall reliability of the MEMS 
device. 
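
A one-at-a-time sweep of this kind can be sketched as follows. The sketch assumes a trained model exposed as a plain function of a dictionary of inputs; `sweep_parameter`, `with_ratio`, the dictionary keys, and the stand-in `toy_model` are hypothetical names used only for illustration, and `toy_model` is a made-up formula, not the fitted network. The baseline values echo the fixed settings quoted in this section. The optional `derive` hook recomputes dependent inputs, such as the f/f0 ratio when a frequency is the swept parameter.

```python
def sweep_parameter(model, baseline, name, values, derive=None):
    """Vary one input over `values` while holding the other baseline inputs constant."""
    if name not in baseline:
        raise KeyError(f"unknown input parameter: {name}")
    results = []
    for value in values:
        point = dict(baseline)  # copy so the baseline itself is never modified
        point[name] = value
        if derive is not None:
            point = derive(point)  # recompute inputs that depend on the swept one
        results.append((value, model(point)))
    return results


def with_ratio(point):
    """Keep the f/f0 ratio consistent with the two frequencies."""
    updated = dict(point)
    updated["f_over_f0"] = updated["operating_freq_hz"] / updated["resonant_freq_hz"]
    return updated


def toy_model(point):
    # Made-up stand-in for a trained network; for demonstration only.
    return 1000.0 - 5.0 * point["humidity_pct"] + 100.0 * point["f_over_f0"]


baseline = {
    "humidity_pct": 35.0,
    "operating_freq_hz": 1720.0,
    "resonant_freq_hz": 1150.0,
    "f_over_f0": 1.496,
    "spring_quotient": 1804.0,
    "tangential_force": 2.5,
}

humidity_trend = sweep_parameter(toy_model, baseline, "humidity_pct", [10.0, 30.0, 60.0])
frequency_trend = sweep_parameter(
    toy_model, baseline, "operating_freq_hz", [1150.0, 2300.0], derive=with_ratio
)
print(humidity_trend)
print(frequency_trend)
```

Each sweep describes the trend only at the chosen baseline; a different baseline can produce a different curve for the same parameter.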

In Figure 24 (lower t50 network), the humidity was varied while the other input parameters were held
constant. Specifically, the operating frequency was set to 1720 Hz, the resonant frequency to 1150 Hz,
the f/f0 ratio to 1.496, the spring quotient to 1804, and the tangential force component to 2.5. The
graph shows the relationship between humidity and the lower t50, specifically, how a very low
humidity has a dramatic positive effect on the life of a MEMS microengine. There is a sharp decrease in
life as the humidity is increased (probably caused by stiction and adhesion). In addition, there is a slight
increase in the median life between humidity levels of 30% and 60%. Since the networks were trained
with a relatively small sample size, these slight deviations may actually be insignificant. However, the
general conclusion that low humidity increases the reliability of the MEMS microengines still applies.


[Plot: Lower t50 Prediction vs. Humidity]

Figure 24. Lower t50 predictions based on humidity changes.

In Figure 25, the same method was used: humidity was varied while all the other inputs were
held constant (at the same values as above). The graph shows that at lower humidity levels
the characteristic shape is larger, but it becomes tighter as the humidity is increased. Again, the noise
within this graph may have to be overlooked since it could be a product of the small training set size.

As seen in the Figure 16 equation, humidity is not a factor in the upper median life, t50. Therefore, a
correlation chart between humidity and the upper median life was not generated.

Figure 26 shows how humidity affects the upper characteristic shape parameter, σ. Very low humidity
and humidity around 40% result in smaller characteristic shape values.

[Plot: Lower σ Prediction vs. Humidity]

Figure 25. Lower σ predictions based on humidity changes.


[Plot: Upper σ Prediction vs. Humidity; x-axis: Humidity (%)]

Figure 26. Upper σ predictions based on humidity changes.

Figure 27 shows the effect that operating frequency has on the lower median life, t50. For this analysis,
humidity was held at 35% (average indoor), resonant frequency (f0) at 1150 Hz, spring quotient at
1804, and tangential force component at 2.5. The operating frequency (f) was varied from 800 Hz
to 2050 Hz, while the ratio f/f0 was set accordingly. The results show that the microengines survive
longer when operating at either the resonant frequency or roughly half the resonant frequency. As
mentioned earlier, the noise in the graph may be a result of the small training set size.

[Plot: Lower t50 Prediction vs. Op Frequency; x-axis: Operating Frequency (Hz), 800 to 2000]

Figure 27. Lower t50 predictions based on operating frequency changes.

Figure 28 was developed by varying operating frequency between 800 Hz and 2050 Hz while the other 
input parameters were held constant at the same values as above (Figure 27 analysis). The optimal 
operating frequency for a tighter characteristic life seems to be around 1000 Hz. 


[Plot: Lower σ Prediction vs. Op Frequency; x-axis: Operating Frequency (Hz), 800 to 2000]

Figure 28. Lower σ predictions based on operating frequency changes.

In Figure 29, operating frequency was once again varied while other inputs were held constant. The
results for the upper median life seem to be the opposite of those for the lower. The most damaging
operating frequency seems to be the resonant frequency.

[Plot: Upper t50 Prediction vs. Op Frequency]

Figure 29. Upper t50 predictions based on operating frequency changes.

As mentioned earlier, one of the reasons for the bimodality of the distributions may be some degree of 
variability in the microengines (resonant frequency may vary). The upper mode may be for the 
“stronger” engines that have a true resonant frequency higher than 1500 Hz. Therefore the upper mode 
engines may not survive when operating at an intermediate frequency. 

Figure 30 shows the trend analysis for upper σ versus operating frequency. The operating frequency
was varied from 800 Hz to 2050 Hz while the other input parameters were held constant as defined in
the previous analysis. The smallest characteristic shape is achieved with either low (around 800 Hz)
or high (around 2000 Hz) operating frequencies. When the microengines operate close to the resonant
frequency, there seems to be greater variability in the failure times.



[Plot; x-axis: Operating Frequency (Hz), 800 to 2000]

Figure 30. Upper σ predictions based on operating frequency changes.

The Figure 31 analyses were conducted by varying the resonant frequency from 1000 Hz to 2000 Hz.
The humidity was held constant at 35%, the operating frequency at 1720 Hz, the spring quotient at
1804, and the tangential force component at 2.5. The trend analysis seen in Figure 31 suggests that the
optimal resonant frequency is large (greater than 1900 Hz). However, the microengine test data used to
train the networks had only two different resonant frequencies, 1150 Hz and 1500 Hz. A larger variety of
resonant frequencies will have to be obtained before the trend analysis becomes more meaningful.


[Plot: Lower t50 Prediction vs. Resonant Frequency; x-axis: Resonant Frequency (Hz), 1000 to 2000]

Figure 31. Lower t50 predictions based on resonant frequency changes.

Figure 32 shows the variation of the lower σ with changes in resonant frequency. As with the previous
analysis, resonant frequency was varied from 1000 Hz to 2000 Hz while all the other input parameters
were held constant. The results of the trend analysis show that the tightest σ values are obtained at the
higher resonant frequencies. Microengines with lower resonant frequencies tend to have more variability
in their failure times.


[Plot: Lower σ Prediction vs. Resonant Frequency]

Figure 32. Lower σ predictions based on resonant frequency changes.

As seen in the transformation equation of Figure 16, resonant frequency does not influence the prediction
of the upper median life, t50. Figure 33 shows the effect that resonant frequency has on the upper σ.
Resonant frequency was varied while the other input parameters were held constant. Any value above
1500 Hz seems to provide a smaller value for the characteristic shape. There is a fair amount of
variability at a resonant frequency of 1300 Hz.


[Plot: Upper σ Prediction vs. Resonant Frequency; x-axis: Resonant Frequency (Hz)]

Figure 33. Upper σ predictions based on resonant frequency changes.

Note that the results from the analysis done in this section only show trends for the specific values of the 
parameters held constant. Different trends will exist if using different values for these fixed parameters. 
In addition, a relatively small training set was used; therefore, higher-resolution results should be
obtained when the analysis is repeated with larger amounts of training data. With smaller training sets,
the results will contain some noise and erratic predictions. Finally, of the parameters that were varied
during testing and data collection, most were varied only minimally and not through a complete range.
This may also limit the results and the effectiveness of the modeling.

Regardless, the results of the analysis outlined in Figures 24-33 showed trends that were expected and
previously demonstrated (Tanner, 2000). We did not perform trend analysis on the spring quotient and
the tangential force component, since these parameters lacked enough variability. Microengines were
tested with only two values for the spring quotient, 1804 and 1805. The tangential force component had
no variability; all testing was done at 2.5.

SECTION 4: SUMMARY AND CONCLUSIONS


Both commercial and educational laboratories throughout the world are fabricating MEMS; funding and
development are growing exponentially as industry realizes the technology's potential. These devices may become one
of the key defining technologies of the upcoming decade. They are essentially a hybrid of electrical and 
mechanical systems at the micron level. MEMS devices are generally batch-fabricated, in large 
quantities, with economies of scale driving unit cost similar to ICs. In addition, the low/no-touch 
fabrication process of MEMS can create reliable systems with precision. 

MEMS devices are a promising and emerging technology due to the potential to significantly alter many 
applications. MEMS have received substantial support for research and development throughout the 
world and will revolutionize sensing and control in automotive, medical, space, military, 
telecommunication, computing, industrial, and recreational applications. 

The next step in the silicon revolution could be the widespread use of MEMS devices in many 
commercial and government applications, especially in the optics and communication environments. See 
Figure 34 for an example of a MEMS optical mirror. In this example, a microengine is used to drive a 
hinged mirror, which could be used as a keyed arming lock or even as an optical relay switch. 



Figure 34. Sandia microengine driving a micro-mirror (courtesy of Sandia National Labs).

MEMS research and development is rapidly progressing in high-technology applications where low-cost,
high-reliability, small-size, and low-power attributes can have dramatic benefits. Just as in the IC
field, the primary economic driver for MEMS is cost. Low cost will ensure its rapid integration into
commercial and government applications.

4.1 SUMMARY OF PROBLEM

An integral part of any development process is being able to quantify the reliability of the device during 
conceptual design. At the conceptual design phase of a project, before the MEMS devices are 
manufactured, traditional methods of determining reliability are inadequate since quantification through 
testing is not possible. Design engineers need a methodology for estimating MEMS reliability early in 
concept design. 

To guide MEMS process development through reliability evaluations, MEMS reliability must be
quantified. Such a reliability estimate must be based on data available at the early design phase of a
project: data about the fabrication process, design characteristics, physical attributes, and
performance expectations for the device, including parameters related to the operating environment
and packaging. The neural network reliability modeling techniques developed within this research
should provide an ideal mechanism to translate these attributes into a predictive reliability tool.

To quantify the reliability of a MEMS component, we must consider not only the device itself, but also
the entire process surrounding the part: detail design, fabrication, packaging schemes, testing, and
ultimately the environment in which the device will operate. This means that the development process
must be qualified and effectively modeled, including the fabrication process, quality standards, and
fabricator’s experience.

4.2 A PROPOSED SOLUTION 

This research developed MEMS reliability models based on neural networks.
These predictive neural networks can be used in the design process to optimize the overall reliability.
Specifically, these networks can provide insight into which design, fabrication, operating, and packaging
attributes are significant determinants of overall reliability (sensitivity analysis can easily be
performed with the results of the modeling).

4.3 DATA SOURCE 

A common obstacle to research of this type is the lack of readily available data, in both quantity and
quality, needed for adequate training of the neural networks. Very little obtainable data on MEMS
reliability exist, since most commercial manufacturers consider their reliability data proprietary. Most
universities and research institutions do not have the quantity of similar data required to adequately train
a neural network. However, Sandia National Laboratories in Albuquerque, New Mexico, has been
designing and manufacturing MEMS components for several years. Sandia is also emphasizing MEMS
research because of the devices' low cost, high reliability, low power consumption, miniature
size, and low weight.

Sandia is very interested in the novel modeling approach developed in this effort and provided access to
their reliability data. Most of the reliability data are from the same basic MEMS design, with only design
and operating environment parameters varied. Even though this is a limited test sample, it has provided
an excellent basis to determine the feasibility of this modeling approach.

4.4 MODELING WITH NEURAL NETWORKS

The general approach to developing neural networks to predict MEMS reliability consists of 
decomposing the system to its component level (gears, gyros, springs, etc.), then selecting which MEMS 
component attributes have a correlation to that component’s reliability. Next, data on these attributes as 
well as component reliability are collected through automated testing. 

After all the input and output data are collected, the neural networks are trained with the inputs 
(attributes) and the outputs — a different network for each type of component. The output was defined to 
be the reliability distribution, specifically the shape parameters of the selected distribution. For this 
research, as mentioned earlier, the Sandia microengine was used because it was the only one available 
with sufficient data. Analysis of the failure data indicated that the log-normal distribution appears to
fit the Sandia microengines best; therefore, the median life, t50, and shape parameter, σ, were used as the output
parameters. 
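
For reference, under the usual two-parameter log-normal model, assumed here with median life t50 and shape parameter σ, the fraction of devices failed by t cycles and the corresponding reliability can be written as follows, where Φ is the standard normal cumulative distribution function. The symbols F and R are introduced here for illustration.

```latex
F(t) = \Phi\!\left(\frac{\ln t - \ln t_{50}}{\sigma}\right),
\qquad
R(t) = 1 - F(t)
```

With this parameterization, F(t50) = 0.5, so t50 is the number of cycles by which half of the devices have failed, and a smaller σ corresponds to a tighter spread of failure times.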

Before training commenced, the attribute and reliability data were randomly partitioned into two sets: the
training data (the majority) and validation data (the remainder). A neural network was then applied to
the training partition (both attribute and reliability statistics are used to train the networks); the
attributes become the system inputs and the reliability statistics the system output. As previously
discussed, attribute data consist of any parameter that might have a correlation to overall reliability, i.e., 
fabrication process details, physical specifications, operating environment, property characteristics, or 
packaging. 
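
The random partition can be sketched as below. This is a minimal sketch under stated assumptions: the function name `partition`, the 80% training fraction, and the fixed seed are illustrative choices, since the text only says that the majority of the data is used for training. The integer identifiers stand in for the fitted distributions.

```python
import random


def partition(records, train_fraction=0.8, seed=0):
    """Randomly split records into (training, validation) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator for a repeatable split
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for the fitted distributions.
distributions = list(range(15))
training, validation = partition(distributions)
print(len(training), len(validation))
```

Fixing the seed keeps the validation set identical across runs, so differences between candidate networks are not confounded with differences in the split.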

During the training process, the neural networks heuristically determine the actual correlation between
the attributes and the reliability statistics. After the networks are trained, the validation data are used
to verify independently that the neural networks provide accurate reliability predictions. Note that
during testing with the validation data, only
the input data are provided to the model. Then the output from the model (the reliability estimate) is 
compared to the real reliability value known from testing. If there is consistently good correlation 
between the estimates and the known values, the model can be used as a predictive tool for MEMS 
reliability. After validation, we can estimate reliability of a newly proposed MEMS device by 
decomposition and using the appropriate trained neural networks. 

The modeling can be ineffective for several reasons. First, insufficient correlation could result when the 
networks are not trained with enough data, or because not enough of the correct inputs were specified. 
Possibly the data transformations or segmentation were inadequate. Aberrations in the data (miscollected 
or faulty data) could also skew the results. However, from the results obtained through modeling of the 
MEMS microengines, the corresponding networks yielded excellent results with very good correlation 
present. 

4.5 SUMMARY OF FINDINGS 

The roughly 800 MEMS microengine failure data were portioned into common sets. Common sets are 
those that have the same input values. Each common set was then fit to a probability distribution. The 
log-normal seemed to provide the best modeling results. Upon closer inspection, some of the set of data 
exhibited a bimodal tendency. Therefore the data were segregated into upper and lower sets to account 
for the bimodality. We tested several neural networks using these sets to determine which would model 
MEMS microengine reliability. After extensive testing, the Error Knowledge Networks, a form of 
statistical network, provided the best results. Furthermore, the modeling results showed that the 
predicted and actual values of all output parameters were strongly correlated: the r values for all 
four output parameters were greater than 0.90. The neural network transformations from the input 
parameters to the four output reliability statistics were performed using a three-pass statistical 
network. Each network pass consisted of either one or two nodes. The transformations used both linear 
and nonlinear multivariate equations. 
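
The partitioning and distribution-fitting steps above can be sketched as follows. This is a minimal 
illustration under stated assumptions, not the code used in this research: the helper names 
group_common_sets and fit_lognormal are hypothetical, and the log-normal parameters are estimated 
simply as the mean and sample standard deviation of the natural log of cycles to failure. 

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def group_common_sets(records):
    """Group (inputs, cycles_to_failure) records by identical input values.

    Each record is a pair: a sequence of input values and a failure count.
    Returns a dict mapping the input tuple to its list of cycles to failure.
    """
    common_sets = defaultdict(list)
    for inputs, cycles in records:
        common_sets[tuple(inputs)].append(cycles)
    return dict(common_sets)


def fit_lognormal(cycles):
    """Estimate log-normal parameters (mu, sigma) from cycles to failure.

    mu is the mean of ln(cycles); sigma is the sample standard deviation of
    ln(cycles). Requires at least two strictly positive observations.
    """
    logs = [math.log(c) for c in cycles]
    return mean(logs), stdev(logs)
```

Applying fit_lognormal to each value of the dictionary returned by group_common_sets gives one 
(mu, sigma) pair per common set. The split of a bimodal set into upper and lower sets is not shown. 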

The network predictions for the output statistics were plotted against the actual values. The lower 
distribution showed outstanding results with very few minor prediction errors. The upper distribution 
had slightly larger prediction errors but this may have been a result of the methodology employed. 
Specifically, for distributions that were unimodal, the values were duplicated during training to create 
both the upper and lower input parameters. This may have resulted in a slight skew of the results. 
Overall, however, the modeling of MEMS reliability using neural networks was highly effective, even 
allowing for the approach used to model bimodality. 
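
The duplication step for unimodal distributions can be sketched as follows. This is an illustration 
only. It assumes that the four output statistics are a (mu, sigma) pair for the lower set followed by 
a (mu, sigma) pair for the upper set; that layout and the function name output_statistics are 
assumptions made for the example. 

```python
def output_statistics(lower_fit, upper_fit=None):
    """Build the four training targets for one common set.

    lower_fit and upper_fit are (mu, sigma) pairs (an assumed layout).
    For a unimodal set, pass only lower_fit: its values are duplicated so
    that the upper and lower targets are identical.
    """
    if upper_fit is None:
        upper_fit = lower_fit  # unimodal case: duplicate the single fit
    return (*lower_fit, *upper_fit)
```

Because a unimodal set contributes the same pair twice, the upper-distribution targets include 
values that do not come from a genuine upper mode, which is one way the slight skew noted above 
could arise. 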

After the modeling was completed, including validating the results, we used the networks to perform 
sensitivity analysis. The first parameter analyzed was humidity. Low humidity showed the best results 
on overall microengine reliability. Next, operating frequency analysis showed that operating at either 
half or full resonant frequency had the best overall effects on microengine life. Further analysis showed 
that microengines with high resonant frequencies typically lasted longer. 
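
A one-at-a-time sensitivity sweep of the kind described above can be sketched as follows. This is a 
minimal illustration: sensitivity_sweep is a hypothetical helper, and toy_predict is a made-up 
stand-in for a trained network, not the model or the results from this research. 

```python
def sensitivity_sweep(predict, baseline, name, values):
    """Vary one input over `values` while holding all others at baseline.

    predict  -- callable taking a dict of inputs and returning a prediction
    baseline -- dict of input name -> baseline value (not modified)
    name     -- the input to vary
    Returns a list of (value, prediction) pairs in the order of `values`.
    """
    results = []
    for value in values:
        inputs = dict(baseline)  # copy so the baseline stays unchanged
        inputs[name] = value
        results.append((value, predict(inputs)))
    return results


def toy_predict(inputs):
    """Made-up placeholder model, for demonstration only."""
    return 1.0e6 * (1.0 - 0.5 * inputs["humidity"])


baseline = {"humidity": 0.5, "frequency_ratio": 1.0}  # hypothetical inputs
sweep = sensitivity_sweep(toy_predict, baseline, "humidity", [0.0, 0.5, 1.0])
```

In practice, predict would wrap the validated network, and the sweep would be repeated for each 
input parameter of interest. 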

Results from the original modeling and sensitivity analysis can be used to guide microengine design. 
When more input parameters are defined and data are collected on them, the optimal combination of 
parameters can be derived. With this insight, designers can optimize future microengine designs. 

4.6 AREAS FOR FURTHER RESEARCH 

One of the major drawbacks to the current modeling effort is the number of data samples used for 
modeling. For this research, only 787 microengines were used. These data were condensed into 15 data 
distributions, which were used to train the neural networks. Ideally, a few thousand microengines 
should be tested to failure to build about 100 different distributions. The resulting neural networks 
would provide more accurate and robust modeling of the reliability statistics. This methodology should be 
repeated as more data are obtained. 

The limited input parameters further constrained the data: there were only 8 different input parameters. 
However, the number of microengine attributes that influence reliability is vastly larger. More detailed 
analysis should be conducted to identify and collect all reliability-dependent attributes. 

Furthermore, for each identified attribute, testing should be conducted with a more systematic approach. 
Testing should be coordinated so that each parameter can be varied through a predefined range. At least 
10 data points should be collected for each setting of the input parameters, while holding all other 
parameters constant. For example, for humidity, 10 microengines should be tested to failure for each 
humidity level selected. Humidity should be tested at regular, precise settings, such as 0%, 10%, 20% ... 
100% while all other input parameters are held constant. 
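
The systematic test plan described above can be sketched as follows. This is an illustration under 
stated assumptions: build_test_plan is a hypothetical helper, and the parameter name and baseline 
value are placeholders rather than settings taken from the Sandia tests. 

```python
def build_test_plan(levels, baseline, name, replicates=10):
    """List every test run: `replicates` units at each level of one input.

    All other inputs are held constant at their baseline values.
    """
    plan = []
    for level in levels:
        for unit in range(1, replicates + 1):
            settings = dict(baseline)  # copy so the baseline stays unchanged
            settings[name] = level
            plan.append({"unit": unit, **settings})
    return plan


humidity_levels = list(range(0, 101, 10))  # 0%, 10%, ..., 100%
baseline = {"frequency_hz": 860}           # placeholder value, for illustration
plan = build_test_plan(humidity_levels, baseline, "humidity_pct")
```

With 11 humidity levels and 10 microengines per level, the humidity study alone requires 110 
microengines tested to failure, which shows how quickly the required sample size grows as more 
parameters are studied. 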

Precise data should be collected for all input parameters defined. For instance, precise failure times 
should be collected whenever feasible, as opposed to “ranged” data. The process should be automated so 
that failure is known within a resolution of a few hundred cycles. An automated process that monitors 
operating frequency or other attributes signaling failure should be incorporated. Also, other parameters 
like resonant frequency need to be precisely determined for each microengine. Average resonant 
frequencies may not provide an accurate assessment of the influence of resonant frequency on overall 
operational life. 

After collecting the data, additional data transformations and feature extractions should be attempted to 
ensure comprehensive modeling. Certain additional features can improve modeling results, as discussed 
in the Neural Network section above. 

Eventually, as more MEMS data are accessed and incorporated into the modeling scheme, decomposition 
into MEMS components can be fully realized. Currently, the whole microengine is modeled in one 
network. However, in the future, complete MEMS systems should be segregated and decomposed into 
individual components before training, testing, and application of the neural networks. At present, 
insufficient data exist to expand the functionality of the networks to this level. 

The methodology employed in this research to account for the bimodality of the probability distribution 
should be further investigated. Even though the current methodology yielded good results, other 
approaches to model this characteristic should also be developed and compared. 

4.7 CONCLUSION 

Extensive research into the development of reliability modeling techniques using neural networks has 
been performed. Comprehensive reliability data from Sandia National Laboratories 
enabled us to develop, test, and validate this prediction methodology. The preliminary results 
suggest that the techniques described herein may be used to accurately estimate the 
reliability of proposed MEMS devices during the conceptual design phase (before they exist). 
Such tools may therefore be used as feedback into the design and development of new MEMS devices to 
ensure that the ultimate end product has a higher likelihood of being robust and reliable. 

APPENDIX: SANDIA MICROENGINE DATA 